polymerase possesses the properties of 1) exhibiting
higher polymerase activity when assayed by using as a
substrate a complex resulting from primer annealing to
a single stranded template DNA, as compared to the
case where an activated DNA is used as a substrate; 2)
possessing a 3'->5' exonuclease activity; 3) being capable
of amplifying a DNA fragment of about 20 kbp, in the
case where polymerase chain reaction (PCR) is carried
out using λ-DNA as a template. It also relates to a DNA
polymerase-constituting protein; a DNA containing the
base sequence encoding the same; and a method for producing
the DNA polymerase. The present invention provides
a novel DNA polymerase possessing both a high
primer extensibility and a 3'->5' exonuclease activity.
Description
TECHNICAL FIELD 

The present invention relates to a DNA polymerase which is useful as a reagent for genetic engineering, a method for
producing the same, and a gene encoding the same.

BACKGROUND ART 

DNA polymerases are useful enzymes as reagents for genetic engineering, and are widely
used in methods for determining the base sequence of DNA, labeling, site-directed mutagenesis, and the
like. Also, thermostable DNA polymerases have recently attracted attention with the development of the polymerase chain
reaction (PCR) method, and various DNA polymerases suitable for the PCR method have been developed and
commercialized.

Presently known DNA polymerases can be roughly classified into four families according to amino acid sequence
homologies, among which family A (pol I type enzymes) and family B (α type enzymes) account for the great majority.
Although DNA polymerases belonging to each family generally possess mutually similar biochemical properties,
detailed comparison reveals that individual DNA polymerase enzymes differ from each other in terms of substrate
specificity, substrate analog incorporation, degree and rate of primer extension, mode of DNA synthesis, association of
exonuclease activity, optimum reaction conditions of temperature, pH and the like, and sensitivity to inhibitors. Thus,
of all available DNA polymerases, those possessing properties especially suitable for the respective experimental
procedures have been selectively used.

DISCLOSURE OF INVENTION

An object of the present invention is to provide a novel DNA polymerase not belonging to any of the above families,
and possessing biochemical properties not possessed by any of the existing DNA polymerases. For example, primer
extension activity and 3'->5' exonuclease activity are considered mutually opposing properties, and none of the existing
DNA polymerase enzymes with strong primer extension activity possesses 3'->5' exonuclease activity, which is an
important proofreading function for DNA synthesis accuracy. Also, the existing DNA polymerases possessing excellent
proofreading functions are poor in primer extension activity. Therefore, development of a DNA polymerase possessing both
potent primer extension activity and potent 3'->5' exonuclease activity would significantly contribute to DNA synthesis
in vitro.

Another object of the present invention is to provide a method for producing the novel DNA polymerase mentioned
above.

Still another object of the present invention is to provide a gene encoding the DNA polymerase of the present
invention. 

As a result of extensive investigation, the present inventors have found genes of the novel DNA polymerase in the
hyperthermophilic archaebacterium Pyrococcus furiosus, have cloned the above genes, and have clarified that
two kinds of novel proteins constitute a novel DNA polymerase that exhibits its activity under coexistence of the
two kinds of proteins. Furthermore, the present inventors have prepared a transformant into which the above
genes are introduced, and have succeeded in mass-producing the complex type DNA polymerase.

Accordingly, the gist of the present invention is as follows: 

[1] A DNA polymerase characterized in that the DNA polymerase possesses the following properties:

1) exhibiting higher polymerase activity when assayed by using as a substrate a complex resulting from primer 
annealing to a single stranded template DNA, as compared to the case where an activated DNA is used as a 
substrate; 

2) possessing a 3'->5' exonuclease activity;

3) being capable of amplifying a DNA fragment of about 20 kbp, in the case where polymerase chain reaction 
(PCR) is carried out using λ-DNA as a template under the following conditions:

PCR conditions: 

(a) a composition of reaction mixture: containing 10 mM Tris-HCl (pH 9.2), 3.5 mM MgCl2, 75 mM KCl, 400 μM
each of dATP, dCTP, dGTP and dTTP, 0.01% bovine serum albumin, 0.1% Triton X-100, 5.0 ng/50 μl λ-DNA,
10 pmole/50 μl primer λ1 (SEQ ID NO:8 in Sequence Listing), primer λ11 (SEQ ID NO:9 in Sequence Listing),
and 3.7 units/50 μl DNA polymerase;

(b) reaction conditions: carrying out a 30-cycle PCR, wherein one cycle is defined as at 98°C for 10 seconds
and at 68°C for 10 minutes; 

[2] The DNA polymerase according to the above item [1], characterized in that the DNA polymerase exhibits a lower
error rate in DNA synthesis as compared to Taq DNA polymerase; 

[3] The DNA polymerase according to the above item [1] or [2], wherein the molecular weight as determined by gel 
filtration method is about 220 kDa or about 385 kDa; 

[4] The DNA polymerase according to any one of the above items [1] to [3], characterized in that the DNA polymerase
exhibits an activity under coexistence of two kinds of DNA polymerase-constituting proteins, a first DNA
polymerase-constituting protein and a second DNA polymerase-constituting protein;

[5] The DNA polymerase according to the above item [4], characterized in that the molecular weights of the first 
DNA polymerase-constituting protein and the second DNA polymerase-constituting protein are about 90,000 Da
and about 140,000 Da as determined by SDS-PAGE, respectively; 

[6] The DNA polymerase according to the above item [4] or [5], characterized in that the first DNA
polymerase-constituting protein which constitutes the DNA polymerase according to the above item [4] or [5] comprises the amino
acid sequence as shown by SEQ ID NO:1 in Sequence Listing, or is a functional equivalent thereof possessing
substantially the same activity which results from deletion, insertion, addition or substitution of one or more amino
acids in the amino acid sequence;

[7] The DNA polymerase according to the above item [4] or [5], characterized in that the second DNA
polymerase-constituting protein which constitutes the DNA polymerase according to the above item [4] or [5] comprises the
amino acid sequence as shown by SEQ ID NO:3 in Sequence Listing, or is a functional equivalent thereof
possessing substantially the same activity which results from deletion, insertion, addition or substitution of one or more
amino acids in the amino acid sequence;

[8] The DNA polymerase according to the above item [4] or [5], characterized in that the first DNA polymerase-constituting
protein which constitutes the DNA polymerase according to the above item [4] or [5] comprises the amino acid
sequence as shown by SEQ ID NO:1 in Sequence Listing, or is a functional equivalent thereof possessing
substantially the same activity which results from deletion, insertion, addition or substitution of one or more amino acids in
the amino acid sequence, and that the second DNA polymerase-constituting protein which constitutes the DNA
polymerase according to the above item [4] or [5] comprises the amino acid sequence as shown by SEQ ID NO:3
in Sequence Listing, or is a functional equivalent thereof possessing substantially the same activity which results
from deletion, insertion, addition or substitution of one or more amino acids in the amino acid sequence;

[9] A first DNA polymerase-constituting protein which constitutes the DNA polymerase according to the above item
[4] or [5], wherein the first DNA polymerase-constituting protein comprises the amino acid sequence as shown by
SEQ ID NO:1, or an amino acid sequence resulting from deletion, insertion, addition or substitution of one or more
amino acids in the amino acid sequence, as a functional equivalent thereof possessing substantially the same
activity;

[10] A second DNA polymerase-constituting protein which constitutes the DNA polymerase according to the above item
[4] or [5], wherein the second DNA polymerase-constituting protein comprises the amino acid sequence as shown
by SEQ ID NO:3, or an amino acid sequence resulting from deletion, insertion, addition or substitution of one or
more amino acids in the amino acid sequence, as a functional equivalent thereof possessing substantially the same
activity;

[11] A DNA containing a base sequence encoding the first DNA polymerase-constituting protein according to the
above item [9], characterized in that the DNA comprises an entire sequence of a base sequence encoding the
amino acid sequence as shown by SEQ ID NO:1 in Sequence Listing, or a partial sequence thereof, or that the
DNA encodes a protein having an amino acid sequence resulting from deletion, insertion, addition or substitution
of one or more amino acids in the amino acid sequence of SEQ ID NO:1 and possessing a function as the first DNA
polymerase-constituting protein;

[12] A DNA containing a base sequence encoding the first DNA polymerase-constituting protein according to the
above item [9], characterized in that the DNA comprises an entire sequence of the base sequence as shown by
SEQ ID NO:2 in Sequence Listing or a partial sequence thereof, or that the DNA comprises a base sequence
capable of hybridizing thereto under stringent conditions;

[13] A DNA containing a base sequence encoding the second DNA polymerase-constituting protein according to
the above item [10], characterized in that the DNA comprises an entire sequence of a base sequence encoding the
amino acid sequence as shown by SEQ ID NO:3, or a partial sequence thereof, or that the DNA encodes a protein
having an amino acid sequence resulting from deletion, insertion, addition or substitution of one or more amino
acids in the amino acid sequence of SEQ ID NO:3 and possessing a function as the second DNA
polymerase-constituting protein;
[14] A DNA containing a base sequence encoding the second DNA polymerase-constituting protein according to
the above item [10], characterized in that the DNA comprises an entire sequence of the base sequence as shown by SEQ
ID NO:4 in Sequence Listing or a partial sequence thereof, or that the DNA comprises a base sequence capable
of hybridizing thereto under stringent conditions;

[15] A method for producing a DNA polymerase, characterized in that the method comprises culturing a
transformant containing both a gene encoding the first DNA polymerase-constituting protein according to the above item [11]
or [12], and a gene encoding the second DNA polymerase-constituting protein according to the above item [13] or
[14], and collecting the DNA polymerase from the resulting culture; and

[16] A method for producing a DNA polymerase, characterized in that the method comprises separately culturing a
transformant containing a gene encoding the first DNA polymerase-constituting protein according to the above item [11] or
[12], and a transformant containing a gene encoding the second DNA polymerase-constituting protein according to
the above item [13] or [14]; mixing the DNA polymerase-constituting proteins contained in the resulting
cultures; and collecting the DNA polymerase.

BRIEF DESCRIPTION OF DRAWINGS

Figure 1 shows a restriction endonuclease map of the DNA fragment inserted into the cosmid Clone No. 264 and 
the cosmid Clone No. 491 obtained in Example 1. 

Figure 2 shows a restriction endonuclease map of an XbaI-XbaI DNA fragment inserted into a plasmid pFU1001.
Figure 3 is a graph for an optimum pH of the DNA polymerase of the present invention.
Figure 4 is a graph for a heat stability of the DNA polymerase of the present invention. 
Figure 5 is a graph for a 3'->5' exonuclease activity of the DNA polymerase of the present invention. 
Figure 6 is an autoradiogram for a primer extension activity of the DNA polymerase of the present invention. 

BEST MODE FOR CARRYING OUT THE INVENTION

(1) DNA Polymerase of Present Invention and Constituting Proteins Thereof 

An example of the DNA polymerase of the present invention has the following properties: 

1) exhibiting higher polymerase activity when assayed by using as a substrate a complex resulting from primer 
annealing to a single stranded template DNA, as compared to the case where an activated DNA (DNase I-treated
calf thymus DNA) is used as a substrate; 

2) possessing a 3'->5' exonuclease activity; 

3) optimum pH being between 6.5 and 7.0 (in potassium phosphate buffer, at 75°C);

4) exhibiting a remaining activity of about 80% after heat treatment at 80°C for 30 minutes; 

5) being capable of amplifying a DNA fragment of about 20 kbp, in the case where polymerase chain reaction 
(PCR) is carried out using λ-DNA as a template under the following conditions:

PCR conditions: 

(a) composition of reaction mixture: containing 10 mM Tris-HCl (pH 9.2), 3.5 mM MgCl2, 75 mM KCl, 400 μM
each of dATP, dCTP, dGTP and dTTP, 0.01% bovine serum albumin (BSA), 0.1% Triton X-100, 5.0 ng/50 μl
λ-DNA, 10 pmole/50 μl primer λ1 (SEQ ID NO:8 in Sequence Listing), primer λ11 (SEQ ID NO:9 in Sequence
Listing), and 3.7 units/50 μl DNA polymerase. Here, one unit of the DNA polymerase is defined as follows: Fifty
microliters of a reaction mixture [20 mM Tris-HCl (pH 7.7), 15 mM MgCl2, 2 mM 2-mercaptoethanol, 0.2 mg/ml
activated DNA, 40 μM each of dATP, dCTP, dGTP and dTTP, 60 nM [3H]-dTTP (manufactured by Amersham)],
containing a sample to assay activity, is reacted at 75°C for 15 minutes. A 40 μl portion of this reaction mixture
is spotted onto a DE paper (manufactured by Whatman) and washed with 5% Na2HPO4 five times. Thereafter,
the remaining radioactivity on the DE paper is measured using a liquid scintillation counter, and the amount of
the enzyme causing the incorporation of 10 nmol of [3H]-dTMP per 30 minutes into a substrate DNA is defined
as one unit of the enzyme; and

(b) PCR conditions: carrying out a 30-cycle PCR, wherein one cycle is defined as at 98°C for 10 seconds and
at 68°C for 10 minutes; and

6) The DNA polymerase of the present invention is superior to the Taq DNA polymerase in terms of both primer
extension activity and accuracy of DNA synthesis. Specifically, the DNA polymerase of the present invention is
superior to the Taq DNA polymerase, a typical thermostable DNA polymerase (e.g., TaKaRa Taq, manufactured by
Takara Shuzo Co., Ltd.), in terms of primer extension properties in DNA synthesis reaction, for instance, the DNA
strand length that can be amplified by the PCR method, and accuracy of DNA synthesis reaction (low error rate
in DNA synthesis).
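
As a rough illustration of the unit definition and the cycling conditions given in item 5) above, the following Python sketch converts a measured [3H]-dTMP incorporation into enzyme units and totals the programmed cycling time. The function names, the assumption that incorporation scales linearly with reaction time, and the scaling of the 40 μl spotted portion up to the full 50 μl reaction are illustrative assumptions and are not stated in the specification.

```python
# Illustrative sketch only; names and the linear-scaling assumptions are hypothetical.

UNIT_NMOL = 10.0       # nmol of dTMP incorporated that defines one unit
UNIT_MINUTES = 30.0    # time window used in the unit definition


def scale_to_full_reaction(nmol_on_paper: float,
                           spotted_ul: float = 40.0,
                           reaction_ul: float = 50.0) -> float:
    """Scale incorporation measured in the spotted portion to the whole reaction.

    Assumption: the spotted portion is representative of the full mixture.
    """
    return nmol_on_paper * reaction_ul / spotted_ul


def polymerase_units(nmol_incorporated: float,
                     reaction_minutes: float = 15.0) -> float:
    """Convert incorporated dTMP (nmol) into units.

    Assumption: incorporation is linear in time, so the 15-minute assay is
    extrapolated to the 30-minute window of the definition.
    """
    if reaction_minutes <= 0:
        raise ValueError("reaction_minutes must be positive")
    nmol_per_window = nmol_incorporated * UNIT_MINUTES / reaction_minutes
    return nmol_per_window / UNIT_NMOL


def pcr_cycling_seconds(cycles: int = 30,
                        denature_s: int = 10,
                        extend_s: int = 10 * 60) -> int:
    """Total programmed hold time, ignoring ramping between temperatures."""
    return cycles * (denature_s + extend_s)


if __name__ == "__main__":
    total_nmol = scale_to_full_reaction(0.04)   # hypothetical reading
    print(polymerase_units(total_nmol))         # units in the assayed sample
    print(pcr_cycling_seconds() / 3600)         # hours of programmed holds
```

With the cycling conditions stated above, the programmed holds alone sum to 30 x 610 s = 18,300 s, a little over five hours.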

The DNA polymerase of the present invention is an enzyme constituted by two kinds of proteins, wherein the molecular
weight of the DNA polymerase of the present invention is about 220 kDa or about 385 kDa, as determined by gel
filtration, and the enzyme shows two bands corresponding to about 90,000 Da and about 140,000 Da on SDS-PAGE,
respectively. The protein of about 90,000 Da (corresponding to ORF3 as described below) is herein referred to as the
first DNA polymerase-constituting protein, and the protein of about 140,000 Da (corresponding to ORF4 as described
below) is herein referred to as the second DNA polymerase-constituting protein. It is assumed that in the DNA polymerase
of the present invention, the first DNA polymerase-constituting protein and the second DNA polymerase-constituting
protein are non-covalently bonded to form a complex in a molar ratio of 1:1 or 1:2.
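
The assumed 1:1 or 1:2 molar ratio can be checked against the reported masses with simple arithmetic. The Python sketch below, with hypothetical function names, sums the apparent SDS-PAGE masses for small subunit combinations and picks the combination closest to each gel filtration value. It treats the apparent masses as additive, which is only an approximation.

```python
from itertools import product

FIRST_KDA = 90.0     # apparent mass of the first protein on SDS-PAGE
SECOND_KDA = 140.0   # apparent mass of the second protein on SDS-PAGE


def complex_mass(n_first: int, n_second: int) -> float:
    """Summed mass (kDa) of a complex with the given copy numbers."""
    return n_first * FIRST_KDA + n_second * SECOND_KDA


def closest_ratio(observed_kda: float, max_copies: int = 2) -> tuple[int, int]:
    """Return (copies of first, copies of second) best matching the observed mass."""
    candidates = product(range(1, max_copies + 1), repeat=2)
    return min(candidates, key=lambda c: abs(complex_mass(*c) - observed_kda))


if __name__ == "__main__":
    for observed in (220.0, 385.0):
        first, second = closest_ratio(observed)
        print(observed, f"{first}:{second}", complex_mass(first, second))
```

Under these assumptions the 220 kDa form is closest to one copy of each protein (230 kDa), and the 385 kDa form is closest to one copy of the first protein with two copies of the second (370 kDa).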

The first DNA polymerase-constituting protein which constitutes the DNA polymerase of the present invention may
comprise the amino acid sequence shown by SEQ ID NO:1 in Sequence Listing, or may be a functional equivalent
possessing substantially the same activity. Also, the second DNA polymerase-constituting protein may comprise the amino
acid sequence shown by SEQ ID NO:3 in Sequence Listing, or may be a functional equivalent possessing substantially
the same activity.

The term "a functional equivalent" as described in the present specification is defined as follows. A protein existing
in nature can undergo mutation, such as deletion, insertion, addition and substitution, of amino acids in its amino acid
sequence owing to modification reactions and the like of the protein itself in vivo or during purification, besides
causes such as polymorphism and mutation of the genes encoding it. However, it is known that
some such proteins exhibit substantially the same physiological or biological activities as the protein without
mutation. Proteins having structural differences as described above but showing no significant differences
in function and activity are referred to as "a functional equivalent." Here, the number of
mutated amino acids is not particularly limited, as long as the resulting protein exhibits substantially the same
physiological or biological activities as the protein without mutation. Examples thereof include one or more
mutations, for instance, one or several mutations, more specifically one to about ten mutations (such as deletion, insertion,
addition and substitution) and the like.

The same can be said for the resulting proteins in the case where the above mutation is artificially introduced into
the amino acid sequence of a protein. In this case, more diverse mutants can be prepared. For example, although the
methionine residue at the N-terminus of a protein expressed in Escherichia coli is reportedly often removed by the
action of methionine aminopeptidase, the methionine residue is not removed completely for some kinds of
proteins, so that both proteins having the methionine residue and proteins without it can be produced. However, the
presence or absence of the methionine residue does not affect protein activity in most cases. It is also known that a
polypeptide resulting from substitution of a particular cysteine residue with serine in the amino acid sequence of human
interleukin 2 (IL-2) retains IL-2 activity [Science, 224, 1431 (1984)].

In addition, during the production of a protein by genetic engineering, the desired protein is often expressed as a
fusion protein. For example, the N-terminal peptide chain derived from another protein is added to the N-terminus of
the desired protein to increase the amount of expression of the desired protein, or purification of the desired protein
is facilitated by adding an appropriate peptide chain to the N- or C-terminus of the desired protein, expressing the
protein, and using a carrier having affinity for the added peptide chain. Accordingly, a DNA polymerase having an amino
acid sequence which partially differs from that of the DNA polymerase of the present invention is within the
scope of the present invention as "a functional equivalent," as long as it exhibits substantially the same activities as the
DNA polymerase of the present invention.

(2) Gene of DNA Polymerase of Present Invention

The DNA encoding the first DNA polymerase-constituting protein which constitutes the DNA polymerase of the
present invention includes a DNA comprising an entire sequence of the base sequence encoding the amino acid
sequence as shown by SEQ ID NO:1 in Sequence Listing or a partial sequence thereof including, for instance, a DNA
comprising an entire sequence of the base sequence as shown by SEQ ID NO:2 or a partial sequence thereof.
Specifically, a DNA comprising a partial sequence of the base sequence encoding the amino acid sequence as shown by SEQ
ID NO:1 including, for instance, the DNA comprising a partial sequence of the base sequence as shown by SEQ ID
NO:2 in Sequence Listing, the base sequence encoding a protein possessing a function of the first DNA
polymerase-constituting protein, is also included in the scope of the present invention. The above DNA also includes
a DNA encoding a protein comprising an amino acid sequence resulting from deletion, insertion, addition,
substitution and the like of one or several amino acids in the amino acid sequence as shown by SEQ ID NO:1, the protein
possessing a function of the first DNA polymerase-constituting protein. Furthermore, a base sequence capable of hybridizing to the above
base sequences under the stringent conditions, the base sequence encoding a protein possessing a function of the first
DNA polymerase-constituting protein, is also included in the scope of the present invention. In addition, the DNA
encoding the second DNA polymerase-constituting protein which constitutes the DNA polymerase of the present invention
includes a DNA comprising an entire sequence of the base sequence encoding the amino acid sequence as shown by
SEQ ID NO:3 in Sequence Listing or a partial sequence thereof including, for instance, a DNA comprising an entire
sequence of the base sequence as shown by SEQ ID NO:4 in Sequence Listing or a partial sequence thereof.
Specifically, the DNA comprising a partial sequence of the base sequence encoding the amino acid sequence as shown by
SEQ ID NO:3, for instance, the DNA comprising a partial sequence of the base sequence as shown by SEQ ID NO:4
in Sequence Listing, the base sequence encoding a protein possessing a function of the second DNA
polymerase-constituting protein, is also included in the scope of the present invention. The above DNA also includes
a DNA encoding a protein comprising an amino acid sequence resulting from deletion, insertion, addition,
substitution and the like of one or several amino acids in the amino acid sequence as shown by SEQ ID NO:3, the protein
possessing a function of the second DNA polymerase-constituting protein. Furthermore, a base sequence capable of hybridizing to the
above base sequences under the stringent conditions, the base sequence encoding a protein possessing a function of
the second DNA polymerase-constituting protein, is also included in the scope of the present invention.

The term "protein possessing a function of the first DNA polymerase-constituting protein" or "protein possessing a
function of the second DNA polymerase-constituting protein" herein refers to a protein possessing properties exhibiting 
a DNA polymerase activity with various physicochemical properties shown in the above items 1) to 6). 

Here, the term "capable of hybridizing under the stringent conditions" refers to hybridizing to a probe after incubating
with the probe at 50°C for 12 to 20 hours in 6 x SSC (wherein 1 x SSC is 0.15 M NaCl, 0.015 M sodium citrate, pH 7.0)
containing 0.5% SDS, 0.1% bovine serum albumin (BSA), 0.1% polyvinyl pyrrolidone, 0.1% Ficoll 400, and 0.01%
denatured salmon sperm DNA.
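
For readers preparing the buffer, the salt concentrations implied by 6 x SSC follow directly from the 1 x SSC definition given above. A minimal Python sketch of that arithmetic (the dictionary layout and function name are illustrative assumptions):

```python
# 1 x SSC as defined in the text, in mol/L
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold: float) -> dict[str, float]:
    """Return the molar concentration of each salt in `fold` x SSC."""
    # Rounding only suppresses floating-point noise in the printed values.
    return {salt: round(conc * fold, 6) for salt, conc in SSC_1X.items()}


if __name__ == "__main__":
    for salt, molar in ssc_molarity(6).items():
        print(f"{salt}: {molar} M")
```

That is, 6 x SSC corresponds to 0.9 M NaCl and 0.09 M sodium citrate.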

The term "DNA containing a base sequence encoding an amino acid sequence" described in the present
specification will be explained. One to six kinds of codons (triplet base combinations) are known to exist for
designating each particular amino acid on a gene. Therefore, there can be a large number of DNAs
encoding a given amino acid sequence, though the number depends on the amino acid sequence. In nature, genes do not always exist
in stable forms, and it is not rare for genes to undergo mutations in the base sequence. There may be a case where
mutations in the base sequence do not give rise to any changes in the amino acid sequence encoded (referred to
as silent mutation). In this case, it can be said that different kinds of genes encoding the same amino acid sequence
have been generated. Therefore, even when a gene encoding a particular amino acid sequence is isolated, the
possibility cannot be ruled out that a variety of genes encoding the same amino acid sequence are produced over
many generations of the organism.
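
The degeneracy described above can be made concrete with the standard genetic code. The following Python sketch (function names are hypothetical, and the standard code table is assumed, which the specification does not spell out) counts the codons available for each amino acid, shows a silent mutation, and counts the distinct DNA sequences that encode a short peptide.

```python
from collections import Counter
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Number of codons designating each of the 20 amino acids (stops excluded).
CODONS_PER_AA = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def translate(dna: str) -> str:
    """Translate an upper-case DNA string codon by codon (frame 0)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences encoding the given peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


if __name__ == "__main__":
    print(min(CODONS_PER_AA.values()), max(CODONS_PER_AA.values()))
    print(translate("CTT"), translate("CTC"))   # same amino acid: a silent change
    print(count_encodings("MLR"))
```

Even a three-residue peptide such as Met-Leu-Arg has 1 x 6 x 6 = 36 possible coding sequences, which is why the number of DNAs encoding a full-length protein is very large.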

Moreover, it is not difficult to artificially produce a variety of genes encoding the same amino acid sequence by
means of various genetic engineering techniques. For example, when a codon used in the natural gene encoding the
desired protein is used at a low frequency in the host in the production of the protein by genetic engineering, the amount
of the protein expressed is sometimes low. In this case, high expression of the desired protein is achieved by artificially
converting the codon into another one used at a high frequency in the host without changing the amino acid sequence
encoded (for instance, Japanese Patent Laid-Open No. Hei 7-102146). As described above, it is, of course, possible to
artificially produce a variety of genes encoding a particular amino acid sequence. Such artificially produced different
polynucleotides are, therefore, included in the scope of the present invention, as long as they encode the amino
acid sequence disclosed in the present invention.
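
Codon conversion of the kind described above can be sketched as follows. This Python example assumes the standard genetic code and uses a small, purely illustrative table of host-preferred codons. The table, the function names, and the example sequence are assumptions for demonstration and are not taken from the specification or the cited publication.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical host preferences (amino acid -> codon); real tables are host-specific.
PREFERRED = {"R": "CGT", "L": "CTG"}


def translate(dna: str) -> str:
    """Translate an upper-case DNA string codon by codon (frame 0)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def recode(dna: str, preferred: dict[str, str]) -> str:
    """Swap codons for the preferred synonymous codon, keeping the protein unchanged."""
    codons = [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]
    recoded = "".join(preferred.get(CODON_TABLE[c], c) for c in codons)
    # Guard against a preference table that is not synonymous.
    if translate(recoded) != translate(dna):
        raise ValueError("preferred codon table changes the encoded protein")
    return recoded


if __name__ == "__main__":
    original = "ATGAGACTA"                      # hypothetical sequence: Met-Arg-Leu
    print(recode(original, PREFERRED), translate(original))
```

The check at the end of `recode` reflects the condition stated in the text: the codons may change, but the encoded amino acid sequence must not.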

(3) Method for Producing DNA Polymerase of Present Invention 

The present inventors have found genes of a novel DNA polymerase in a hyperthermophilic archaebacterium,
Pyrococcus furiosus, and have cloned them to clarify that the genes encode a novel DNA polymerase exhibiting its activity by the
coexistence of the two kinds of proteins encoded by the genes. In the present invention, the DNA polymerase of the present
invention can be mass-produced by preparing transformants incorporating the above genes. For this purpose, the DNA
polymerase may be produced by a process comprising culturing a transformant containing both the gene encoding the first DNA
polymerase-constituting protein and the gene encoding the second DNA polymerase-constituting protein, and
collecting the DNA polymerase from the resulting culture. Alternatively, the DNA polymerase may be produced by a process
comprising separately culturing a transformant containing the gene encoding the first DNA polymerase-constituting protein
and a transformant containing the gene encoding the second DNA polymerase-constituting protein, mixing the DNA
polymerase-constituting proteins contained in the resulting cultures, and collecting the DNA polymerase therefrom.

Here, the phrase "transformant containing both the gene encoding the first DNA polymerase-constituting protein
and the gene encoding the second DNA polymerase-constituting protein" refers to either a transformant resulting from
co-transformation with two expression vectors containing the respective genes, or a transformant prepared by
recombining both genes into one expression vector to allow the respective proteins to be expressed.
(4) The cloning of the genes of the DNA polymerase of the present invention, the analysis of the obtained clones, and the
physicochemical properties, activities, and applicability to the PCR method of the expressed DNA polymerase, and the like are
hereinafter described in detail.

The strain used for the present invention is not subject to particular limitation. Examples thereof include Pyrococcus
furiosus DSM3638, as a strain belonging to the genus Pyrococcus. The above strain is available from Deutsche
Sammlung von Mikroorganismen und Zellkulturen GmbH. When the above strain was cultured in an appropriate
growth medium, a crude extract was prepared from the resulting culture, and the crude extract was subjected to
polyacrylamide gel electrophoresis, the present inventors found several kinds of protein bands
showing DNA polymerase activity in the gel, and it was therefore anticipated that genes corresponding to these respective
bands exist. Specifically, the novel DNA polymerase gene and the product thereof can be cloned by the
procedures exemplified below.

1) DNA is extracted from Pyrococcus furiosus; 
2) The DNA obtained in 1) is digested with an appropriate restriction endonuclease, to prepare a DNA library with
a plasmid, cosmid and the like, as a vector; 

3) The library prepared in 2) is introduced into Escherichia coli, and a foreign gene is expressed to prepare a protein
library in which crude extracts of the resulting clones are collected;

4) A DNA polymerase activity is assayed by using the protein library prepared in 3), and a foreign DNA is taken out 
from the Escherichia coli clone which provides a crude extract having an activity;

5) The Pyrococcus furiosus DNA fragment contained in the plasmid or cosmid taken out is analyzed to narrow 
down the gene region encoding a protein exhibiting a DNA polymerase activity; 

6) The base sequence of the region in which the protein exhibiting a DNA polymerase activity is presumably 
encoded is determined to deduce the primary structure of the protein; and 

7) An expression plasmid is constructed to take a form which more easily allows the expression of the protein
deduced in 6) in Escherichia coli, and the produced protein is purified and analyzed for the properties thereof. 

The above DNA donor, Pyrococcus furiosus DSM3638, is a hyperthermophilic archaebacterium, which is cultured
at 95°C under anaerobic conditions. Known methods can be used for disrupting grown cells followed by
extracting and purifying DNA, for digesting the obtained DNA with a restriction endonuclease, and for other
procedures. Such methods are described in detail in Molecular Cloning: A Laboratory Manual, 75-178, published by
Cold Spring Harbor Laboratory in 1982, edited by T. Maniatis et al.

In the preparation of a DNA library, the triple helix cosmid vector (manufactured by Stratagene), for example, can
be used. The DNA of Pyrococcus furiosus is partially digested with Sau3AI (manufactured by Takara Shuzo Co., Ltd.),
and the digested DNA is subjected to density gradient centrifugation to obtain the long DNA fragments. They are ligated
to the BamHI site of the above vector, followed by in vitro packaging. The respective transformants obtained from the
DNA library thus prepared are separately cultured. After harvesting, cells are disrupted by ultrasonication, and the
resulting lysate is heat-treated to inactivate the DNA polymerase from the host Escherichia coli. Thereafter, a
supernatant containing a thermostable protein can be obtained by centrifugation. The above supernatant is named
a cosmid protein library. By assaying the DNA polymerase activity using a portion of the supernatant, a clone
that expresses the DNA polymerase derived from Pyrococcus furiosus can be obtained. DNA polymerase activity can
be assayed using the known method described in DNA Polymerase from Escherichia coli, published by Harper and
Row, edited by D.R. Davis, 263-276 (authored by C.C. Richardson).

One of the DNA polymerase genes of Pyrococcus furiosus has already been cloned and its structure clarified by
the present inventors, as described in Nucleic Acids Research, 21, 259-265 (1993). The translation product of the
above gene is a polypeptide having a molecular weight of about 90,000 Da and consisting of 775 amino acids, and the
amino acid sequence thereof clearly contains conserved sequences of the α-type DNA polymerases. In fact, since the
DNA polymerase activity exhibited by this gene product is inhibited by aphidicolin, which is a specific inhibitor of α-type
DNA polymerases, the above DNA polymerase is distinguishable from the DNA polymerase of the present invention.
Therefore, clones carrying the above known gene can be removed from the obtained clones exhibiting thermostable
DNA polymerase activity by a process comprising digesting the cosmid contained in each clone, carrying out hybridization with the
above gene as a probe, and selecting the clones that do not hybridize. A restriction endonuclease map of the DNA insert can be
prepared for the cosmid contained in a resulting clone containing the novel DNA polymerase gene. Next, the location
of the DNA polymerase gene on the above DNA fragment can be determined by a process comprising dividing the
above DNA fragment into various regions on the basis of the obtained restriction endonuclease map, subcloning each
region into a plasmid vector, introducing the resulting vector into Escherichia coli, and assaying the thermostable DNA
polymerase activity exhibited therein. An XbaI-XbaI DNA fragment of about 10 kbp containing the DNA polymerase
gene can be thus obtained.
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The recombinant Escherichia coli harboring a plasmid incorporating the above DNA fragment exhibits a sufficient 
level of a DNA synthesis activity in the crude extract thereof even after treatment at 90°C for 20 minutes, while such an 
. activity is not found in any plasmids without incorporating a DNA fragment. Therefore, it can be concluded that the infor- 
mation for producing a thermostable polymerase is present on the DNA fragment, and that a gene having the above 

5 information is expressed in the above Escherichia coli. The plasmid resulting from recombination of the DNA fragment 
into a pTV1 1 8N vector (manufactured by Takara Shuzo Co., Ltd.) is named as pFU1 001 . The Escherichia coli JM1 09 
transformed with the above plasmid is named and identified as Escherichia coli JM109/£FU1001, has been deposited 
under accession number FERM BP-5579 with the National Institute of Bioscience and Human-Technology, Agency of 
Industrial Science and Technology, Ministry of International Trade and Industry, of which the address is 1-3, Higashi 1- 

w chome, Tsukuba-shi, Ibaraki-ken, 305, Japan, since August 11, 1995 (date of original deposit) under the Budapest 
Treaty. 

The base sequence of the DNA fragment inserted in the plasmid pFU1001 can be determined by a conventional 
method, for instance, by the dideoxy method. Furthermore, regions capable of encoding a protein in the base 
sequence, i.e., open reading frames (ORFs), can be deduced by analyzing the resulting base sequence. 

An 8,450 bp sequence in the base sequence of the XbaI-XbaI DNA fragment of about 10 kbp inserted in the plasmid
pFU1001 is shown by SEQ ID NO:5 in Sequence Listing. In the base sequence, there are six consecutive ORFs,
named ORF1, ORF2, ORF3, ORF4, ORF5, and ORF6, respectively, in order from the 5' terminal side. FIG. 2 shows
the restriction endonuclease map of the above XbaI-XbaI fragment and the location of the ORFs on the fragment
(ORF1 to ORF6, from the left in the Figure).

No sequence showing homology to any known DNA polymerase was found in any one of the above six
ORFs. It should be noted, however, that on ORF1 and ORF2, there is a sequence homologous to the CDC6 protein
found in Saccharomyces cerevisiae, or a sequence homologous to the CDC18 protein found in Schizosaccharomyces
pombe. The CDC6 and the CDC18 are considered to be proteins that are necessary for the cell cycle shift to the DNA
synthesis phase (S phase) in yeasts, the proteins regulating initiation of the DNA replication. Also, the ORF6 has a
sequence homologous to the RAD51 protein, known to act in DNA damage repair in yeasts and recombination in the
somatic mitosis phase and in the meiosis phase in yeasts, and a sequence homologous to the Dmc1 protein, a meiosis
phase-specific homolog of the RAD51 protein. The gene encoding the RAD51 protein is also known to be expressed at
the cell cycle shift from the G1 to S phase. For the other ORFs, namely ORF3, ORF4, and ORF5, no known proteins
have been found to have a homologous sequence.

It is possible to determine from which of the above ORFs the thermostable DNA polymerase activity is derived by
a process comprising preparing recombinant plasmids inserted with the respective DNA fragments from which various
regions have been deleted, transforming a host with the plasmids, and assaying the thermostable polymerase activity
of each transformant obtained. A transformant resulting from transformation with a recombinant plasmid inserted with
a DNA fragment prepared by deleting ORF1 or ORF2, or deleting ORF5 or ORF6, from the above XbaI-XbaI DNA fragment
of about 10 kbp retains the thermostable DNA polymerase activity, while one resulting from transformation with a
recombinant plasmid inserted with a DNA fragment prepared by deleting ORF3 or ORF4 loses the activity. This fact
suggests that the DNA polymerase activity is encoded by ORF3 or ORF4.

It is possible to determine by which of ORF3 and ORF4 the DNA polymerase is encoded by a process comprising
preparing recombinant plasmids separately inserted with the respective ORFs, transforming a host with each
recombinant plasmid, and assaying for a thermostable DNA polymerase activity in each transformant obtained.
Unexpectedly, only very weak DNA polymerase activity is detected in a crude extract obtained from the transformant
containing ORF3 or ORF4 alone. However, since a level of thermostable DNA polymerase activity similar to that in
the transformant containing both ORF3 and ORF4 is obtained when the two extracts are mixed, it is
shown that the novel DNA polymerase of the present invention requires the actions of the translation products of the
two ORFs. By determining the molecular weight of the DNA polymerase, it is possible to find out whether the two
proteins form a complex to exhibit the DNA polymerase activity, or whether one modifies the other to convert it to
an active enzyme. The results of the determination of the molecular weight of the above DNA polymerase by the gel
filtration method demonstrate that the above two proteins form a complex.

The base sequence of ORF3 is shown by SEQ ID NO:2 in Sequence Listing, and the amino acid sequence of the
ORF3-derived translation product, namely the first DNA polymerase-constituting protein as deduced from the base
sequence, is shown by SEQ ID NO:1. The base sequence of ORF4 is shown by SEQ ID NO:4 in Sequence Listing, and
the amino acid sequence of the ORF4-derived translation product, namely the second DNA polymerase-constituting
protein as deduced from the base sequence, is shown by SEQ ID NO:3.

The DNA polymerase of the present invention can be expressed in cells by culturing a transformant resulting from
transformation with a recombinant plasmid into which both ORF3 and ORF4 are introduced, for instance, Escherichia
coli JM109/pFU1001, under usual culturing conditions, for instance, culturing in an LB medium (10 g/l tryptone, 5 g/l
yeast extract, 5 g/l NaCl, pH 7.2) containing 100 µg/ml ampicillin. The above polymerase can be purified from the above
cultured cells to the extent that nearly only the two bands of the two kinds of DNA polymerase-constituting
proteins are obtained in SDS-polyacrylamide gel electrophoresis (SDS-PAGE), by carrying out ultrasonication, heat
treatment, and chromatography using an anion exchange column (RESOURCE Q column, manufactured by Pharmacia),
a heparin Sepharose column (HiTrap Heparin, manufactured by Pharmacia), a gel filtration column (Superose 6HR,
manufactured by Pharmacia) or the like. It is also possible to obtain the desired DNA polymerase by a process
comprising separately culturing transformants respectively containing ORF3 or ORF4 alone as described above, and
subsequently mixing the cultured cells obtained, their crude extracts, or purified DNA polymerase-constituting proteins.
When mixing the two kinds of DNA polymerase-constituting proteins, special procedures are not required, and the
active DNA polymerase can be obtained simply by mixing the extracts from the respective transformants or
the two proteins purified therefrom in appropriate amounts.

The DNA polymerase of the present invention thus obtained provides two bands at positions corresponding to
molecular weights of about 90,000 Da and about 140,000 Da on SDS-PAGE, and these two bands correspond
to the first and second DNA polymerase-constituting proteins, respectively.

As shown in FIG. 3, the DNA polymerase of the present invention exhibits an optimum pH in the neighborhood
of 6.5 to 7.0 at 75°C in a potassium phosphate buffer. When the enzyme activity of the above DNA polymerase is
assayed at various temperatures, the enzyme exhibits a high activity at 75° to 80°C. However, because the double
stranded structure of the activated DNA used as a substrate for the activity assay is disrupted at higher temperatures,
an accurate optimum temperature for the activity of the above enzyme has not been determined. The above DNA
polymerase possesses a high heat stability, retaining not less than 80% of its activity even after a heat treatment at
80°C for 30 minutes, as shown in FIG. 4. This level of heat stability allows the use of the above enzyme for the PCR
method.

Also, when assessing the influence of aphidicolin, a specific inhibitor of α-type DNA polymerases, it is demonstrated
that the activity of the above DNA polymerase is not inhibited even in the presence of 2 mM aphidicolin.

Analysis of the biochemical properties of the purified DNA polymerase shows that the DNA polymerase of the
present invention possesses very excellent primer extension activity in vitro. As shown in Table 1, in the case where
DNA polymerase activity is assayed using a substrate in a form resulting from primer annealing to a single stranded
DNA (the M13-HT Primer), a higher nucleotide-incorporating activity is observed than with the activated DNA
ordinarily used for activity assays (DNase I-treated calf thymus DNA). When the primer extension ability of
the DNA polymerase of the present invention is compared with that of other DNA polymerases using the above
M13-HT Primer substrate, the DNA polymerase of the present invention exhibits superior extension activity as compared
to a known DNA polymerase derived from Pyrococcus furiosus (Pfu DNA polymerase, manufactured by Stratagene) and
Taq DNA polymerase derived from Thermus aquaticus (TaKaRa Taq, manufactured by Takara Shuzo Co., Ltd.).
Furthermore, when an activated DNA is added to this reaction system as a competitor substrate, the primer extension
activities of the above two kinds of DNA polymerases are strongly inhibited, while that of the DNA polymerase of the
present invention is only slightly inhibited, demonstrating that the DNA polymerase of the present invention possesses
a high affinity for substrates of the primer extension type (FIG. 6).


Table 1

                                        Relative Activity
                          ----------------------------------------------------
Substrates                DNA Polymerase of the   Pfu DNA        Taq DNA
                          Present Invention       Polymerase     Polymerase
------------------------------------------------------------------------------
activated DNA             100                     100            100
thermal-denatured DNA     340                     87             130
M13-HT primer             170                     23             90
M13-RNA primer            52                      0.49           38
poly dA-Oligo dT          94                      390            290
poly A-Oligo dT           0.085                                  0.063
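
The values in Table 1 appear to be normalized so that each enzyme's activity on activated DNA is 100, which makes it straightforward to express every other substrate as a fold of that baseline. The following is a minimal Python sketch that uses only the figures in Table 1; the dictionary layout and the function name are illustrative assumptions, and the empty Pfu cell in the last row is carried as None rather than guessed.

```python
# Relative activities from Table 1 (activated DNA = 100 for each enzyme).
# Tuple order: present invention, Pfu, Taq. None marks the cell with no value.
ENZYMES = ("present invention", "Pfu", "Taq")
TABLE_1 = {
    "activated DNA": (100, 100, 100),
    "thermal-denatured DNA": (340, 87, 130),
    "M13-HT primer": (170, 23, 90),
    "M13-RNA primer": (52, 0.49, 38),
    "poly dA-Oligo dT": (94, 390, 290),
    "poly A-Oligo dT": (0.085, None, 0.063),
}


def fold_over_activated(substrate):
    """Express each enzyme's activity on `substrate` as a fold of its activated-DNA value."""
    baseline = TABLE_1["activated DNA"]
    row = TABLE_1[substrate]
    return {
        enzyme: (None if value is None else value / base)
        for enzyme, value, base in zip(ENZYMES, row, baseline)
    }


if __name__ == "__main__":
    for substrate in TABLE_1:
        print(substrate, fold_over_activated(substrate))
```

Read this way, the primed single-stranded substrate (M13-HT primer) is the row on which the enzyme of the present invention exceeds its own activated-DNA baseline while the two reference enzymes fall below theirs.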



Also, the DNA polymerase of the present invention shows excellent performance when used for the PCR method.
With the DNA polymerase derived from Thermus aquaticus, commonly used for the PCR method, it is difficult to amplify
a DNA fragment of not less than 10 kbp using the above DNA polymerase alone, and a DNA fragment of not less than
20 kbp can be amplified only when it is used in combination with another DNA polymerase [Proceedings of the National
Academy of Sciences of the USA, 91, 2216-2220 (1994)]. Also, the strand length of DNA amplifiable using the Pfu DNA
polymerase is reportedly at most about 3 kbp. By contrast, when using the DNA polymerase of the present invention,
the amplification of a DNA fragment of 20 kbp in length is made possible even when used alone without addition of any
other enzymes.

Moreover, the DNA polymerase of the present invention, which also has an associated 3'->5' exonuclease activity, is
comparable, in terms of the ratio of the exonuclease activity to the DNA polymerase activity, to the Pfu DNA
polymerase, which is known to ensure very high accuracy in DNA synthesis owing to its high exonuclease activity
(FIG. 5). Also, the error rate during the DNA synthesis reaction is lower for the DNA polymerase of the present
invention than for the Taq DNA polymerase. These various properties demonstrate that the DNA polymerase of the
present invention serves very well as a reagent for genetic engineering techniques such as the PCR method.

The finding of the novel DNA polymerase genes according to the present invention also provides an interesting
suggestion as follows. In order to determine the manner in which the region containing the genes for ORF3 and ORF4
encoding a novel DNA polymerase is intracellularly transcribed, the present inventors have analyzed an RNA fraction
prepared from Pyrococcus furiosus cells by northern blotting method, RT-PCR method and primer extension method.
As a result, it is confirmed that ORF1 to ORF6 are transcribed from immediately upstream of ORF1 as a single
messenger RNA (mRNA). From the above finding, there is an expectation that the production of the ORF1 and the ORF2
in cells is subjected to the same control as that for the ORF3 and the ORF4. When considering in combination with the
sequence homologies of ORF1, ORF2, ORF5, and ORF6 to those of CDC6 and CDC18, the CDC6 and the CDC18
being involved in the regulation for initiation of the DNA replication in yeasts, the above expectation suggests that the
novel DNA polymerase of the present invention is highly likely to be a DNA polymerase important for the DNA
replication. Since it is also expected that the DNA replication system of archaebacteria, to which group Pyrococcus
furiosus belongs, is closely related to that of eukaryotic cells, there is a possibility of the presence of an enzyme
similar to the DNA polymerase of the present invention as a DNA polymerase important for replication that has not
been found in eukaryotes.

It is also expected that thermostable DNA polymerases similar to the DNA polymerase of the present invention are
produced in other bacteria belonging to hyperthermophilic archaebacteria like Pyrococcus furiosus, including, for
instance, bacteria other than Pyrococcus furiosus belonging to the genus Pyrococcus; bacteria belonging to the genus
Pyrodictium, the genus Thermococcus, the genus Staphylothermus, and other genera. When these enzymes are
constituted by two DNA polymerase-constituting proteins, like the DNA polymerase of the present invention, it is expected
that a similar DNA polymerase activity is exhibited by combining one of the two DNA polymerase-constituting proteins
and the DNA polymerase-constituting protein of the present invention corresponding to the other DNA
polymerase-constituting protein.

The thermostable DNA polymerases similar to the DNA polymerase of the present invention, produced by the
above hyperthermophilic archaebacteria, are expected to have homology to the DNA polymerase of the present
invention in terms of their amino acid sequences and the base sequences of the genes encoding them. It is therefore
possible to obtain the gene for a thermostable DNA polymerase similar to the DNA polymerase of the present invention,
of which the base sequence is not identical to that of the DNA polymerase of the present invention but which possesses
similar enzyme activities, by a process comprising introducing into an appropriate microorganism a DNA fragment
obtained from one of the above thermophilic archaebacteria by hybridization using, as a probe, a gene isolated by the
present invention or a portion of the above base sequence, and assaying the DNA polymerase activity in a heat-treated
lysate prepared in the same manner as the above cosmid protein library by an appropriate method.

The above hybridization can be carried out under the following conditions. Specifically, a DNA-immobilized
membrane is incubated with a probe at 50°C for 12 to 20 hours in 6 x SSC (wherein 1 x SSC indicates 0.15 M NaCl, 0.015
M sodium citrate, pH 7.0) containing 0.5% SDS, 0.1% bovine serum albumin, 0.1% polyvinyl pyrrolidone, 0.1% Ficoll
400, and 0.01% denatured salmon sperm DNA. After termination of the incubation, the membrane is washed, starting
at 37°C in 2 x SSC containing 0.5% SDS, and changing the SSC concentration from the starting level down to 0.1 x SSC
while varying the temperature up to 50°C, until the signal from the immobilized DNA becomes distinguishable from the
background.
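
Because the hybridization and wash buffers are all specified as multiples of 1 x SSC, the actual salt concentrations follow by simple scaling. The sketch below is illustrative only; the function name is an assumption, and the only inputs are the 1 x SSC definition (0.15 M NaCl, 0.015 M sodium citrate) and the 6 x, 2 x and 0.1 x strengths mentioned in the text.

```python
# 1 x SSC as defined in the text: 0.15 M NaCl, 0.015 M sodium citrate.
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each solute in `fold` x SSC."""
    # Rounding only hides floating-point noise such as 0.8999999999999999.
    return {solute: round(molar * fold, 6) for solute, molar in SSC_1X_MOLAR.items()}


if __name__ == "__main__":
    for fold in (6, 2, 0.1):  # hybridization, first wash, most stringent wash
        print(f"{fold} x SSC:", ssc_molarity(fold))
```

The drop from 6 x SSC to 0.1 x SSC lowers the salt concentration sixty-fold, which together with the rise in wash temperature is what increases the stringency of the wash.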

Thus, it is possible to obtain a gene for a thermostable DNA polymerase similar to the DNA polymerase of the
present invention of which the DNA polymerase activity is not identical but of the same level as that of the DNA
polymerase of the present invention, by introducing into an appropriate microorganism a DNA fragment obtained by a gene
amplification reaction using, as a primer, a gene isolated by the present invention or a portion of the base sequence of
the gene, with a DNA obtained from one of the above thermophilic archaebacteria as a template, or a DNA fragment
resulting from the thermophilic archaebacterium by hybridization with the fragment obtained by a gene amplification
reaction as a probe, and assaying the DNA polymerase activity in the same manner as above.

The present invention is hereinafter described by means of the following examples, but the scope of the present
invention is not limited only to those examples. The % values shown in Examples below mean % by weight.

Example 1

(1) Preparation of Pyrococcus furiosus Genomic DNA

Pyrococcus furiosus DSM3638 was cultured in the following manner:

A medium having a composition comprising 1% tryptone, 0.5% yeast extract, 1% soluble starch, 3.5% Jamarin S
Solid (Jamarin Laboratory), 0.5% Jamarin S Liquid (Jamarin Laboratory), 0.003% MgSO4, 0.001% NaCl, 0.0001%
FeSO4·7H2O, 0.0001% CoSO4, 0.0001% CaCl2·7H2O, 0.0001% ZnSO4, 0.1 ppm CuSO4·5H2O, 0.1 ppm
KAl(SO4)2, 0.1 ppm H3BO3, 0.1 ppm Na2MoO4·2H2O, and 0.25 ppm NiCl2·6H2O was placed in a two-liter medium
bottle and sterilized at 120°C for 20 minutes. After removal of dissolved oxygen by sparging with nitrogen gas,
the above strain was inoculated into the resulting medium. Thereafter, the culture was kept standing at
95°C for 16 hours. After termination of the cultivation, cells were harvested by centrifugation.

The harvested cells were then suspended in 4 ml of 0.05 M Tris-HCl (pH 8.0) containing 25% sucrose. To this
suspension, 0.8 ml of lysozyme [5 mg/ml, 0.25 M Tris-HCl (pH 8.0)] and 2 ml of 0.2 M EDTA were added and incubated at
20°C for 1 hour. After adding 24 ml of an SET solution [150 mM NaCl, 1 mM EDTA, and 20 mM Tris-HCl (pH 8.0)], 4 ml
of 5% SDS and 400 µl of proteinase K (10 mg/ml) were added to the resulting mixture. Thereafter, the resulting mixture
was reacted at 37°C for 1 hour. After termination of the reaction, phenol-chloroform extraction and subsequent ethanol
precipitation were carried out to prepare about 3.2 mg of genomic DNA.

(2) Preparation of Cosmid Protein Library 

Four hundred micrograms of the genomic DNA from Pyrococcus furiosus DSM3638 was partially digested with
Sau3AI and fractionated by size into 35 to 50 kb fractions by the density gradient ultracentrifugation method. One
microgram of the triple helix cosmid vector (manufactured by Stratagene) was digested with XbaI, dephosphorylated using
an alkaline phosphatase (manufactured by Takara Shuzo Co., Ltd.), and further digested with BamHI. The resulting
treated vector was subjected to ligation after mixing with 140 µg of the above 35 to 50 kb DNA fractions. The genomic
DNA fragment from Pyrococcus furiosus was packaged into lambda phage particles by the in vitro packaging method using
"GIGAPACK GOLD" (manufactured by Stratagene), to prepare a library. A portion of the obtained library was then
transduced into E. coli DH5αMCR. Several transformants out of the resulting transformants were selected to prepare
cosmid DNA. After confirmation of the presence of an insert of appropriate size, about 500 transformants were again
selected from the above library, and each was separately cultured in 150 ml of an LB medium (10 g/l tryptone, 5 g/l yeast
extract, 5 g/l NaCl, pH 7.2) containing 100 µg/ml ampicillin. The resulting culture was centrifuged, and the harvested
cells were suspended in 1 ml of 20 mM Tris-HCl at a pH of 8.0, and the resulting suspension was then heat-treated at
100°C for 10 minutes. Next, ultrasonication was carried out, and a heat treatment was carried out again at 100°C for 10
minutes. The lysate obtained as a supernatant after centrifugation was used as a cosmid protein library.

(3) Assay of DNA Polymerase Activity 

The DNA polymerase activity was assayed using calf thymus DNA (manufactured by Worthington) activated by
DNase I treatment (activated DNA) as a substrate. DNA activation and assay of DNA polymerase activity were carried
out by the method described in DNA Polymerase from Escherichia coli, 263-276 (authored by C.C. Richardson),
published by Harper & Row, edited by D.R. Davis.

An assay of enzyme activity was carried out by the following method. Specifically, 50 µl of a reaction solution [20
mM Tris-HCl (pH 7.7), 15 mM MgCl2, 2 mM 2-mercaptoethanol, 0.2 mg/ml activated DNA, 40 µM each of dATP, dCTP,
dGTP and dTTP, 60 nM [3H]-dTTP (manufactured by Amersham)], containing a sample for assaying its activity, was
prepared and reacted at 75°C for 15 minutes. A 40 µl portion of this reaction mixture was then spotted onto a DE paper
(manufactured by Whatman) and washed with 5% Na2HPO4 five times. The remaining radioactivity on the DE paper
was assayed using a liquid scintillation counter. The amount of enzyme which incorporated 10 nmol of [3H]-dTMP per
30 minutes into the substrate DNA, assayed by the above-described enzyme activity assay method, was defined as one
unit of the enzyme.
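
The unit definition above (10 nmol of dTMP incorporated per 30 minutes) can be applied to a measured incorporation as sketched below. This is an illustrative calculation, not part of the described method: the function name is hypothetical, the defaults read the assay as a 50 µl reaction run for 15 minutes with a 40 µl portion spotted, and it assumes that incorporation is linear in time so that a 15-minute reaction can be scaled to 30 minutes. Converting scintillation counts to nmol requires the specific activity of the labeled dTTP, which is not given here, so the input is taken directly in nmol.

```python
def units_from_incorporation(nmol_in_spot, reaction_min=15.0,
                             spotted_ul=40.0, reaction_ul=50.0):
    """Convert dTMP incorporation measured on the spotted portion into enzyme units.

    One unit = 10 nmol dTMP incorporated per 30 minutes (definition in the text).
    Assumes (not stated in the text) that incorporation is linear over time.
    """
    # Scale from the spotted portion to the whole reaction volume.
    nmol_total = nmol_in_spot * reaction_ul / spotted_ul
    # Scale the reaction time to the 30-minute reference period.
    nmol_per_30_min = nmol_total * 30.0 / reaction_min
    return nmol_per_30_min / 10.0


if __name__ == "__main__":
    # 4 nmol on the filter -> 5 nmol in the full reaction -> 10 nmol per 30 min.
    print(units_from_incorporation(4.0))
```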

(4) Selection of Cosmid Clones Containing DNA Polymerase Gene 

A reaction mixture comprising 20 mM Tris-HCl (pH 7.7), 2 mM MgCl2, 2 mM 2-mercaptoethanol, 0.2 mg/ml
activated DNA, 40 µM each of dATP, dCTP, dGTP and dTTP, 60 nM [3H]-dTTP (manufactured by Amersham) was prepared.
One µl each of the extracts from 5 clones of the cosmid protein library, namely 5 µl of extracts per
reaction, was added to 45 µl of this mixture. After the mixture was reacted at 75°C for 15 minutes, a 40 µl portion of






EP 0 870 832 A1 

each reaction mixture was spotted onto a DE paper and washed with 5% Na2HPO4 five times. The remaining
radioactivity on the DE paper was assayed using a liquid scintillation counter. Each group found to have some activity in
the primary assay, wherein one group consisted of 5 clones, was separated into its 5 individual clones, and a
secondary assay was then carried out for each clone. Since it had already been shown, by a hybridization test using a
known DNA polymerase gene as a probe, that the cosmid DNA library included clones containing that gene, designated
as Clone Nos. 57, 154, 162, and 363, 5 clones possessing DNA synthesis activity other than those clones were found,
as Clone Nos. 41, 153, 264, 462, and 491.
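
The two-stage selection described above (a primary assay on pools of 5 extracts, then a secondary assay on the individual clones of each active pool, then exclusion of clones already known to carry the previously cloned gene) can be sketched as follows. This is a hypothetical illustration: the function name, the numbering of clones from 1 to 500, and the grouping of consecutive clone numbers into pools are assumptions; only the clone numbers themselves come from the text.

```python
def two_stage_screen(clone_ids, is_active, pool_size=5):
    """Assay pools first, then the individual clones of every active pool.

    Returns the list of active clones and the total number of assays run.
    """
    positives = []
    assays = 0
    for start in range(0, len(clone_ids), pool_size):
        pool = clone_ids[start:start + pool_size]
        assays += 1  # primary assay on the pooled extracts
        if any(is_active(clone) for clone in pool):
            for clone in pool:
                assays += 1  # secondary assay on a single clone
                if is_active(clone):
                    positives.append(clone)
    return positives, assays


# Clone numbers reported in the text.
KNOWN_GENE_CLONES = {57, 154, 162, 363}      # hybridize to the known gene
NOVEL_CLONES = {41, 153, 264, 462, 491}      # active, non-hybridizing

if __name__ == "__main__":
    active = KNOWN_GENE_CLONES | NOVEL_CLONES
    clones = list(range(1, 501))             # assumed numbering of about 500 clones
    positives, assays = two_stage_screen(clones, lambda c: c in active)
    novel = [c for c in positives if c not in KNOWN_GENE_CLONES]
    print(novel, assays)
```

Pooling keeps the number of assays well below one per clone as long as active clones are rare, which is the practical reason for the primary and secondary assay split.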

(5) Preparation of Restriction Endonuclease Map

Cosmids were isolated from the above 5 clones, and each cosmid was digested with BamHI. When the resulting
migration patterns were examined, several mutually common bands were found, suggesting that those 5 clones
carried overlapping regions with slight shifts. With this finding in mind, the DNA inserts in Clone Nos. 264 and 491
were used to prepare the restriction endonuclease map. The cosmids prepared from both clones were digested with
various restriction endonucleases. As a result of determining the respective cleavage sites of KpnI, NotI, PstI, SmaI,
XbaI, and XhoI (all manufactured by Takara Shuzo Co., Ltd.), which digested the inserts into fragments of appropriate
sizes, a map as shown in FIG. 1 was obtained.

(6) Subcloning of DNA Polymerase Gene

On the basis of the restriction endonuclease map as shown in FIG. 1, various DNA fragments of about 10 kbp in
length were cut out from the cosmid derived from clone No. 264 or 491. The fragments were then subcloned into the
pTV118N or pTV119N vector (manufactured by Takara Shuzo Co., Ltd.). The resulting transformant with each of the
recombinant plasmids was then subjected to assaying of the thermostable DNA polymerase activity, to demonstrate
that a gene for production of a highly thermostable DNA polymerase was present on an XbaI-XbaI fragment of about 10
kbp. A plasmid resulting from recombination of the XbaI-XbaI fragment in the pTV118N vector was then named
plasmid pFU1001, and the Escherichia coli JM109 transformed with the plasmid was named Escherichia coli
JM109/pFU1001.

Example 2

Determination of Base Sequence of DNA Fragment Containing Novel DNA Polymerase Gene 

The above XbaI-XbaI fragment, containing the DNA polymerase gene, was again cut out from the plasmid
pFU1001 obtained in Example 1 with XbaI, and blunt-ended using a DNA blunting kit (manufactured by Takara Shuzo
Co., Ltd.). The resultant was then ligated to the new pTV118N vector, previously linearized with SmaI, in different
orientations to yield plasmids for preparing deletion mutants. The resulting plasmids were named pFU1002 and
pFU1003, respectively. Deletion mutants were sequentially prepared from both ends of the DNA insert using these
plasmids. The Kilo-Sequence deletion kit (manufactured by Takara Shuzo Co., Ltd.) applying Henikoff's method (Gene,
28, 351-359) was used for the above preparation. The 3'-overhanging type and 5'-overhanging type restriction
endonucleases used were PstI and XbaI, respectively. The base sequence of the insert was determined by the dideoxy method
using the BcaBEST dideoxy sequencing kit (manufactured by Takara Shuzo Co., Ltd.) with the various deletion mutants
as templates.

An 8,450 bp sequence in the base sequence determined is shown by SEQ ID NO:5 in Sequence Listing. As a result
of analysis of the base sequence, there were revealed six open reading frames (ORFs) capable of encoding proteins,
present at positions corresponding to Base Nos. 123-614 (ORF1), 611-1381 (ORF2), 1384-3222 (ORF3), 3225-7013
(ORF4), 7068-7697 (ORF5), and 7711-8385 (ORF6) in the base sequence as shown by SEQ ID NO:5 in Sequence
Listing. The restriction endonuclease map of the about 10 kbp XbaI-XbaI DNA fragment recombined in the plasmid
pFU1001 and the location of the above-mentioned ORFs thereon are shown in FIG. 2.
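
The base positions listed above are enough to check that each ORF spans a whole number of codons and to see how closely the six ORFs are packed. The following Python sketch is illustrative only; the function names are assumptions, the coordinates are those given for SEQ ID NO:5, and whether each range includes the stop codon is not stated in the text, so the code reports codon counts rather than amino acid counts.

```python
# ORF positions (1-based, inclusive) on SEQ ID NO:5, as listed in the text.
ORFS = {
    "ORF1": (123, 614),
    "ORF2": (611, 1381),
    "ORF3": (1384, 3222),
    "ORF4": (3225, 7013),
    "ORF5": (7068, 7697),
    "ORF6": (7711, 8385),
}


def orf_summary(start, end):
    """Length in nucleotides, number of codons, and whether the span is a multiple of 3."""
    length = end - start + 1
    return {"length_nt": length, "codons": length // 3, "whole_codons": length % 3 == 0}


def intergenic_gaps(orfs):
    """Bases between consecutive ORFs; a negative value means the ORFs overlap."""
    spans = list(orfs.values())
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(spans, spans[1:])]


if __name__ == "__main__":
    for name, (start, end) in ORFS.items():
        print(name, orf_summary(start, end))
    print("gaps:", intergenic_gaps(ORFS))
```

The small or negative gaps between ORF1 through ORF4 are consistent with the description of the ORFs as consecutive on the fragment.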

In addition, the thermostable DNA polymerase activity was assayed using the above various deletion mutants. The
results demonstrated that the DNA polymerase activity is lost when the deletion involves the ORF3 and ORF4 regions,
regardless of whether the deletion started from upstream or downstream. This finding demonstrated that the translation
products of the ORF3 and the ORF4 were important in the exhibition of the DNA polymerase activity. The base
sequence of the ORF3 is shown by SEQ ID NO:2 in Sequence Listing, and the amino acid sequence of the translation
product of the ORF3 as deduced from the base sequence is shown by SEQ ID NO:1 in Sequence Listing. Also, the
base sequence of ORF4 is shown by SEQ ID NO:4 in Sequence Listing, and the amino acid sequence of the translation
product of ORF4 as deduced from the base sequence is shown by SEQ ID NO:3 in Sequence Listing.

Example 3

Preparation of Purified DNA Polymerase Standard Preparation 

The Escherichia coli JM109/pFU1001 obtained in Example 1 was cultured in 500 ml of an LB medium (10 g/l
tryptone, 5 g/l yeast extract, 5 g/l NaCl, pH 7.2) containing ampicillin at a concentration of 100 µg/ml. When the culture
broth turbidity reached 0.6 in A600, an inducer, isopropyl-β-D-thiogalactoside (IPTG), was added and the culture was
continued for 16 hours. After harvesting, the harvested cells were suspended in 37 ml of a sonication buffer [50 mM
Tris-HCl, pH 8.0, 0.2 mM 2-mercaptoethanol, 10% glycerol, 2.4 mM PMSF (phenylmethanesulfonyl fluoride)] and applied
to an ultrasonic disrupter. Forty-two milliliters of a crude extract was recovered as a supernatant by centrifugation at
12,000 rpm for 10 minutes, which was then heat-treated at 80°C for 15 minutes. Centrifugation was again carried out
at 12,000 rpm for 10 minutes to yield 33 ml of a heat-treated enzyme solution. The above solution was then dialyzed
with 800 ml of buffer A [50 mM potassium phosphate, pH 6.5, 2 mM 2-mercaptoethanol, 10% glycerol] as an external
dialysis liquid for 2 hours x 4. After dialysis, 32 ml of the enzyme solution was applied to a RESOURCE Q column
(manufactured by Pharmacia) which was previously equilibrated with buffer A, and subjected to chromatography using
an FPLC system (manufactured by Pharmacia). The chromatogram was developed on a linear concentration gradient
from 0 to 500 mM NaCl. A fraction having a DNA polymerase activity was eluted at 340 mM NaCl.

Ten milliliters of the enzyme solution collected as the active fraction was desalted and concentrated by
ultrafiltration, and dissolved in buffer A + 150 mM NaCl to yield 3.5 ml of an enzyme solution. The resulting enzyme
solution was then applied to a HiTrap Heparin column (manufactured by Pharmacia), previously equilibrated with the same
buffer. A chromatogram was developed on a linear concentration gradient from 150 to 650 mM NaCl using an FPLC
system, to yield an active fraction eluted at 400 mM NaCl. Five milliliters of this fraction was concentrated to 120 µl of
a solution including 50 mM potassium phosphate, pH 6.5, 2 mM 2-mercaptoethanol, and 75 mM NaCl by repeating
desalting and concentration using ultrafiltration. The resulting concentrated solution was then applied to a gel filtration
column of Superose 6 (manufactured by Pharmacia), previously equilibrated with the same buffer, and eluted with the
same buffer. As a result, fractions having DNA polymerase activity were eluted at positions corresponding to retention
times of 34.7 minutes and 38.3 minutes. Comparison with the elution positions of molecular weight markers under the
same conditions suggests that these activity peaks have molecular weights of about 385 kDa and about 220 kDa,
respectively. These molecular weights corresponded to a complex formed by the translation product of ORF3 and the
translation product of ORF4 in a molar ratio of 1:2 and another complex formed by the above translation products in
a molar ratio of 1:1, respectively. For the former peak, however, the possibility that the complex is formed by the two
translation products in a 2:2 molar ratio cannot be ruled out, since the error in molecular weight determination
increases as the molecular weight increases.
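The stoichiometry argument above can be checked roughly with a short calculation. The sketch below is only an illustration under stated assumptions: it takes an average residue mass of 110 Da (a common rule of thumb, not a value from this specification), and it assumes that the 613-residue sequence of SEQ ID NO:1 and the 1263-residue sequence of SEQ ID NO:3 are the ORF3 and ORF4 translation products, respectively. The function name `complex_mass_kda` is hypothetical.

```python
# Rough mass estimate for ORF3/ORF4 complexes of different stoichiometry.
# Assumptions (not taken from the specification):
#   - average residue mass of 110 Da
#   - ORF3 product = 613 residues (SEQ ID NO:1), ORF4 product = 1263 residues (SEQ ID NO:3)
AVG_RESIDUE_DA = 110.0
ORF3_RESIDUES = 613
ORF4_RESIDUES = 1263


def complex_mass_kda(n_orf3: int, n_orf4: int) -> float:
    """Return the estimated mass in kDa of a complex with the given subunit counts."""
    total_residues = n_orf3 * ORF3_RESIDUES + n_orf4 * ORF4_RESIDUES
    return AVG_RESIDUE_DA * total_residues / 1000.0


if __name__ == "__main__":
    for n3, n4 in [(1, 1), (1, 2), (2, 2)]:
        print(f"ORF3:ORF4 = {n3}:{n4} -> about {complex_mass_kda(n3, n4):.0f} kDa")
```

Under these assumptions the 1:1 estimate lies close to the observed 220 kDa peak, while the observed 385 kDa peak falls between the 1:2 and 2:2 estimates, which is consistent with the caveat that a 2:2 complex cannot be ruled out.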

Example 4

(1) Biochemical Properties of DNA Polymerase

For a DNA polymerase preparation forming a complex of the translation products of ORF3 and ORF4 obtained in
Example 3, namely the first DNA polymerase-constituting protein and the second DNA polymerase-constituting protein
in a ratio of 1:1, the optimum MgCl2 and KCl concentrations were assayed first. The DNA polymerase activity was
assayed in a reaction system containing 20 mM Tris-HCl, pH 7.7, 2 mM 2-mercaptoethanol, 0.2 mg/ml activated DNA,
and 40 µM each of dATP, dGTP, dCTP and dTTP in the presence of 2 mM MgCl2, while the KCl concentration was
increased stepwise from 0 to 200 mM in 20 mM increments. As a result, the maximum activity was exhibited at
a KCl concentration of 60 mM. Next, the DNA polymerase activity was assayed in the same reaction system but in the
presence of 60 mM KCl, while the MgCl2 concentration was increased stepwise from 0.5 to 25 mM in 2.5 mM
increments, and the activities at each concentration were compared. In this case, the maximum activity was exhibited
at an MgCl2 concentration of 10 mM; in the absence of KCl, the maximum activity was exhibited at an MgCl2
concentration of 17.5 mM.

The optimum pH was then assayed. The DNA polymerase activity was assayed at 75°C by using potassium phosphate
buffers at various pH levels, and preparing a reaction mixture comprising 20 mM potassium phosphate, 15 mM
MgCl2, 2 mM 2-mercaptoethanol, 0.2 mg/ml activated DNA, 40 µM each of dATP, dCTP, dGTP and dTTP, and 60 nM
[3H]-dTTP. The results are shown in FIG. 3, wherein the abscissa indicates the pH, and the ordinate indicates the
radioactivity incorporated in high-molecular DNA. As shown in the figure, the DNA polymerase of the present invention
exhibited the maximum activity at a pH of 6.5 to 7.0. When Tris-HCl was used in place of potassium phosphate, the
activity increased with alkalinity, and the maximum activity was exhibited at a pH of 8.02, the highest pH level used in
the assay.

The heat stability of the DNA polymerase of the present invention was assayed as follows: The purified DNA
polymerase was prepared as a mixture containing 20 mM Tris-HCl (pH 7.7), 2 mM 2-mercaptoethanol, 10% glycerol,
and 0.1% bovine serum albumin, and the resulting mixture was incubated at various temperatures for 30 minutes.
The remaining DNA polymerase activity was then assayed. The results are shown in FIG. 4, wherein the abscissa indicates
the incubation temperature, and the ordinate indicates the remaining activity. As shown in the figure, the present
enzyme retained not less than 80% of its activity even after heat treatment at 80°C for 30 minutes.

In order to compare the modes of inhibition by inhibitors, the DNA polymerase of the present invention and an α-type
DNA polymerase derived from Pyrococcus furiosus (Pfu DNA polymerase, manufactured by Stratagene), a known DNA
polymerase, were compared using aphidicolin, a specific inhibitor of α-type DNA polymerases. The activity changes
were examined while the aphidicolin concentration was increased from 0 to 2.0 mM in the presence of 20 mM Tris-HCl,
pH 7.7, 15 mM MgCl2, 2 mM 2-mercaptoethanol, 0.2 mg/ml activated DNA, and 40 µM each of dATP, dGTP, dCTP
and dTTP. As a result, the activity of the Pfu DNA polymerase was decreased to 20% of the original activity at 1.0 mM,
while the novel DNA polymerase of the present invention was not inhibited at all even at 2.0 mM.

(2) Primer Extension Reaction

Next, in order to compare the selectivity of the DNA polymerase of the present invention for different forms of
substrate DNA, the following template-primers were examined. Aside from the activated DNA used for conventional assay
of the activity, the substrates prepared include a thermal-denatured DNA prepared by treating the activated DNA at
85°C for 5 minutes; M13-HT Primer prepared by annealing a 45-base synthetic deoxyribooligonucleotide of the
sequence as shown by SEQ ID NO:6 in Sequence Listing as a primer to the M13 phage single stranded DNA
(M13mp18 ssDNA, manufactured by Takara Shuzo Co., Ltd.); M13-RNA Primer prepared by annealing a 17-base
synthetic ribooligonucleotide of the sequence as shown by SEQ ID NO:7 in Sequence Listing as a primer to the same M13
phage single stranded DNA; Poly dA-Oligo dT prepared by mixing polydeoxyadenosine (Poly dA, manufactured by
Pharmacia) and oligodeoxythymidine (Oligo dT, manufactured by Pharmacia) in a 20:1 molar ratio; and Poly A-Oligo dT
prepared by mixing polyadenosine (Poly A, manufactured by Pharmacia) and oligodeoxythymidine in a 20:1 molar ratio.

The DNA polymerase activity was assayed using these substrates in place of the activated DNA. Table 1 shows the
relative activity for each substrate, with the activity obtained using the activated DNA as a substrate defined as 100.
For comparison, the Pfu DNA polymerase and the Taq DNA polymerase derived from Thermus aquaticus (TaKaRa Taq,
manufactured by Takara Shuzo Co., Ltd.) were also examined in the same manner. As shown in Table 1, in comparison
with the other DNA polymerases, the novel DNA polymerase of the present invention exhibited higher activity when the
substrate used was the M13-HT Primer rather than the activated DNA, demonstrating that the
novel DNA polymerase of the present invention is especially suitable for primer extension reaction.
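The normalization used for the relative activities (activity with the activated DNA defined as 100) can be sketched as follows. This is a minimal illustration only: the function name `relative_activity` is hypothetical, and the numbers in the usage example are placeholder values, not data from Table 1.

```python
# Normalize measured activities so that the reference substrate equals 100.
# The function name and the example numbers are hypothetical placeholders.
def relative_activity(activities: dict, reference: str = "activated DNA") -> dict:
    """Return each activity as a percentage of the reference substrate's activity."""
    ref_value = activities[reference]
    if ref_value <= 0:
        raise ValueError("reference activity must be positive")
    return {name: 100.0 * value / ref_value for name, value in activities.items()}


if __name__ == "__main__":
    # Placeholder raw activities (arbitrary units), for illustration only.
    example = {"activated DNA": 50.0, "M13-HT Primer": 75.0}
    print(relative_activity(example))
```

A substrate with a relative activity above 100 is one on which the enzyme works better than on the activated DNA, which is the property reported for the M13-HT Primer.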

The primer extension activity was investigated in further detail. The M13-HT Primer, previously labeled at the
5'-end with [γ-32P]-ATP (manufactured by Amersham) and T4 polynucleotide kinase (manufactured by Takara Shuzo Co.,
Ltd.), was used as a substrate. Ten microliters of a reaction mixture [20 mM Tris-HCl (pH 7.7), 15 mM MgCl2, 2 mM
2-mercaptoethanol, 270 µM each of dATP, dGTP, dCTP and dTTP] containing the above substrate in a final
concentration of 0.05 µg/µl and various DNA polymerases in amounts providing 0.05 units of activity as assayed with the
activated DNA as a substrate was reacted at 75°C for 1, 2, 3, or 4 minutes. After termination of the reaction, 2 µl of a
reaction stop solution (95% formaldehyde, 20 mM EDTA, 0.05% bromophenol blue, 0.05% xylenecyanol) was added, and
the mixture was subjected to thermal denaturation treatment at 95°C for 3 minutes. Two microliters of the reaction
mixture was then subjected to electrophoresis using polyacrylamide gel containing 8 M urea, and an autoradiogram was
prepared. Also, in order to examine the extension activity in the presence of the activated DNA as a competitor substrate,
the activated DNA was added to the above reaction mixture to a final concentration of 0.4 ng/ml, and an autoradiogram
was prepared by the same procedures as described above. The autoradiogram obtained is shown in
FIG. 6.

In the figure, Pol, Pfu, and Taq show the results for the DNA polymerase of the present invention, the Pfu DNA 
polymerase and the Taq DNA polymerase, respectively. In addition, 1, 2, 3, and 4 each indicates reaction time (min). In 
the figure, the representation and V show the results obtained in the absence and in the presence, respectively, of 

the activated DNA. The lanes G, A, T, and C at the left end of the figure show the results of electrophoresis of the
reaction products obtained by a chain termination reaction by the dideoxy method using the same substrate as above,
which were used to estimate the length of each extension product. As shown in the figure, the DNA polymerase of the
present invention exhibited primer extension activity superior to that of the Pfu DNA polymerase and the Taq DNA
polymerase. It was also shown that the DNA polymerase of the present invention was unlikely to be inhibited by the
activated DNA, in contrast to the Taq DNA polymerase, which exhibited relatively high primer extension activity in the
absence of the activated DNA but was markedly inhibited by the addition of the activated DNA in great excess. From the
above findings, it was confirmed that the DNA polymerase of the present invention possesses high affinity especially for
primer extension type substrates, in which a single primer is annealed to a single stranded template DNA.

(3) Presence or Absence of Associated Exonuclease Activity

The exonuclease activity of the DNA polymerase of the present invention was assessed as follows: As a substrate
for 5'->3' exonuclease activity detection, a DNA fragment labeled with 32P at the 5'-end was prepared by a process
comprising digesting a pUC119 vector (manufactured by Takara Shuzo Co., Ltd.) with SspI (manufactured by Takara
Shuzo Co., Ltd.), separating the resulting 386 bp DNA fragment by agarose gel electrophoresis, purifying the fragment,
and labeling with [γ-32P]-ATP and polynucleotide kinase. Also, as a substrate for 3'->5' exonuclease activity detection,
a DNA fragment labeled with 32P at the 3'-end was prepared by a process comprising digesting a pUC119 vector with
Sau3AI, separating the resulting 341 bp DNA fragment by agarose gel electrophoresis, purifying the fragment, and
carrying out a fill-in reaction using [γ-32P]-CTP (manufactured by Amersham) and the Klenow fragment (manufactured by
Takara Shuzo Co., Ltd.). The labeled DNAs were purified by gel filtration with NICK COLUMN (manufactured by
Pharmacia) and used in the subsequent reaction. To a reaction solution [20 mM Tris-HCl (pH 7.7), 15 mM MgCl2, 2 mM
2-mercaptoethanol] containing 1 ng of these labeled DNAs, 0.015 units of DNA polymerase was added, and the resulting
mixture was reacted at 75°C for 2.5, 5, and 7.5 minutes. The DNAs were precipitated by adding ethanol. The
radioactivity existing in the supernatant was assayed using a liquid scintillation counter, and the amount of degradation by
the exonuclease activity was calculated. The DNA polymerase of the present invention was shown to possess potent
3'->5' exonuclease activity, while no 5'->3' exonuclease activity was observed. The 3'->5' exonuclease activity of the Pfu
DNA polymerase, known to possess potent 3'->5' exonuclease activity, was also assayed in the same manner as above.
The results are shown together in FIG. 5.

In the figure, the abscissa indicates the reaction time, and the ordinate indicates the ratio of radioactivity released
into the supernatant relative to the radioactivity contained in the entire reaction mixture. Also, the open circles indicate
the results for the DNA polymerase of the present invention, and the solid circles indicate those for the Pfu DNA
polymerase. As shown in the figure, the DNA polymerase of the present invention showed potent 3'->5' exonuclease
activity of the same level as that of the Pfu DNA polymerase, known to possess high accuracy of DNA synthesis owing
to high 3'->5' exonuclease activity.
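The ordinate described above, the radioactivity released into the supernatant relative to the radioactivity in the entire reaction mixture, can be computed as in the following sketch. The function name `percent_released` and the counts in the usage example are hypothetical placeholders, not measurements from FIG. 5.

```python
# Fraction of label released into the ethanol supernatant, expressed in percent.
# The function name and the example counts are hypothetical placeholders.
def percent_released(supernatant_cpm: float, total_cpm: float) -> float:
    """Return released radioactivity as a percentage of the total radioactivity."""
    if total_cpm <= 0:
        raise ValueError("total radioactivity must be positive")
    if not 0 <= supernatant_cpm <= total_cpm:
        raise ValueError("supernatant counts must lie between 0 and the total")
    return 100.0 * supernatant_cpm / total_cpm


if __name__ == "__main__":
    # Placeholder counts for three time points (minutes -> supernatant cpm).
    total = 1000.0
    for minutes, cpm in [(2.5, 100.0), (5.0, 250.0), (7.5, 400.0)]:
        print(f"{minutes} min: {percent_released(cpm, total):.1f}% released")
```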

(4) Comparison of Accuracy of DNA Synthesis Reaction

The accuracy of DNA synthesis reaction by DNA polymerases was examined using a pUC118 vector (manufactured
by Takara Shuzo Co., Ltd.), partially made single stranded (gapped duplex plasmid), as a template. The single
stranded pUC118 vector was prepared by the method described in Molecular Cloning: A Laboratory Manual, 2nd ed.,
4.44-4.48, published by Cold Spring Harbor Laboratory in 1989, edited by T. Maniatis et al., using a helper phage
M13K07 (manufactured by Takara Shuzo Co., Ltd.) with Escherichia coli MV1184 (manufactured by Takara Shuzo Co.,
Ltd.) as a host. The double stranded DNA was prepared by digesting the pUC118 vector with PvuII (manufactured by
Takara Shuzo Co., Ltd.), subjecting the digested vector to agarose gel electrophoresis, and recovering a DNA fragment
of about 2.8 kbp.

One microgram of the above single stranded DNA and 2 µg of the double stranded DNA were mixed to make 180
µl of a mixture with sterile distilled water, and the solution was then incubated at 70°C for 10 minutes. Thereafter, twenty
microliters of 20 x SSC was added to the resulting mixture, and the mixture was further kept standing at 60°C for 10
minutes. The DNA was recovered by ethanol precipitation. A portion thereof was subjected to agarose gel
electrophoresis, and it was confirmed that a gapped duplex plasmid was obtained. Thirty microliters of a reaction
mixture [10 mM Tris-HCl, pH 8.5, 50 mM KCl, 10 mM MgCl2, 1 mM each of dATP, dCTP, dGTP and dTTP], containing
one-tenth of the resulting gapped duplex plasmid, was incubated at 70°C for 3 minutes, after which 0.5 units
of DNA polymerase was added thereto, and a DNA synthesis reaction was carried out at 70°C for 10 minutes. After
termination of the reaction, Escherichia coli DH5α (manufactured by BRL) was transformed using 10 µl of the reaction
mixture. The resulting transformant was cultured at 37°C for 18 hours on an LB plate containing 100 µg/ml ampicillin,
0.1 mM IPTG, and 40 µg/ml 5-bromo-4-chloro-3-indolyl-β-D-galactoside. The white and blue colonies formed on the plate
were counted, and the formation rate of white colonies, which result from DNA synthesis errors, was calculated.
As a result, the white colony formation rate was 3.18% when the Taq DNA polymerase was used as the DNA
polymerase, in contrast to a lower formation rate of 1.61% when the DNA polymerase of the present invention was
used.
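The white colony formation rate described above can be sketched as a simple proportion. The sketch assumes that the rate is the number of white colonies divided by all counted colonies (white plus blue); the function name `white_colony_rate` and the colony counts in the example are hypothetical, and only the two percentages 3.18 and 1.61 come from the text.

```python
# White colony formation rate as a percentage of all counted colonies.
# Assumption: rate = white / (white + blue). Function name and counts are hypothetical.
def white_colony_rate(white: int, blue: int) -> float:
    """Return the percentage of white colonies among all counted colonies."""
    total = white + blue
    if total == 0:
        raise ValueError("no colonies counted")
    return 100.0 * white / total


# Rates reported in the text (percent).
TAQ_RATE = 3.18
NOVEL_POLYMERASE_RATE = 1.61

if __name__ == "__main__":
    print(white_colony_rate(5, 95))  # placeholder counts, for illustration only
    fold = TAQ_RATE / NOVEL_POLYMERASE_RATE
    print(f"Taq produced about {fold:.2f}-fold more white colonies")
```

On the reported figures, the rate with the Taq DNA polymerase is roughly twice that with the DNA polymerase of the present invention.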

(5) Application to PCR

In order to compare the performance of the DNA polymerase of the present invention in PCR with that of the Taq
DNA polymerase, PCR was carried out with λ-DNA as a template. The reaction mixture for the DNA polymerase of the
present invention had the following composition: 10 mM Tris-HCl (pH 9.2), 3.5 mM MgCl2, 75 mM KCl, 400 µM each of
dATP, dCTP, dGTP and dTTP, 0.01% bovine serum albumin (BSA), and 0.1% Triton X-100. The reaction mixture for the
Taq DNA polymerase had the following composition: 10 mM Tris-HCl (pH 8.3), 1.5 mM MgCl2, 50 mM KCl, and 400 µM
each of dATP, dCTP, dGTP and dTTP. Fifty microliters of a reaction mixture containing 5.0 ng/50 µl λ-DNA
(manufactured by Takara Shuzo Co., Ltd.), 10 pmol/50 µl each of primer λ1 and primer λ11, and 3.7 units/50 µl DNA
polymerase was prepared. The base sequences of the primer λ1 and the primer λ11 are shown by SEQ ID NO:8 and
SEQ ID NO:9 in Sequence Listing, respectively. A 30-cycle PCR was then carried out with the above reaction mixture,
wherein one cycle is defined as 98°C for 10 seconds and 68°C for 10 seconds. Five microliters of the reaction mixture
was subjected to agarose gel electrophoresis, and the amplified DNA fragment was confirmed by staining with ethidium
bromide. As a result, no DNA fragment amplification was found when the Taq DNA polymerase was used, whereas
amplification of a DNA fragment of about 20 kbp was confirmed with the DNA polymerase of the present invention.
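The per-reaction amounts given above can be converted into concentrations with a few lines of arithmetic. The sketch below assumes the 50 µl reaction volume and the amounts stated in the text (5.0 ng template, 10 pmol of each primer, 3.7 units of enzyme); the function name `per_microliter` is hypothetical. Note that pmol/µl is numerically equal to µM.

```python
# Convert per-reaction amounts into per-microliter concentrations.
# Amounts and the 50 µl volume are taken from the text; the function name is hypothetical.
REACTION_VOLUME_UL = 50.0
TEMPLATE_NG = 5.0
PRIMER_PMOL = 10.0
ENZYME_UNITS = 3.7


def per_microliter(amount: float, volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Return the amount per microliter of reaction mixture."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount / volume_ul


if __name__ == "__main__":
    print(f"template: {per_microliter(TEMPLATE_NG):.2f} ng/µl")
    print(f"each primer: {per_microliter(PRIMER_PMOL):.2f} µM")  # pmol/µl == µM
    print(f"enzyme: {per_microliter(ENZYME_UNITS):.3f} units/µl")
```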

The experiment was then carried out by changing the primers to the primer λ1 and the primer λ10. The base
sequence of the primer λ10 is shown by SEQ ID NO:10 in Sequence Listing. Twenty-five microliters of a reaction
mixture having a similar composition to that shown above and containing 2.5 ng of λ-DNA, 10 pmol each of the primer
λ1 and the primer λ10, and 3.7 units of DNA polymerase was prepared. The reaction mixture was reacted in 5
cycles under the same reaction conditions as those described above, and 5 µl of the reaction mixture was subjected to
agarose gel electrophoresis and stained with ethidium bromide. No specific amplification was observed when the Taq
DNA polymerase was used, whereas a DNA fragment of about 15 kbp was amplified with the DNA polymerase of the
present invention.

Example 5 

(1) Construction of Plasmid for Expression of ORF3 Translation Product Alone

PCR was carried out using a mutant plasmid 6-82 as a template, together with a primer M4 (manufactured by Takara
Shuzo Co., Ltd.) and the primer N03, whose base sequence is shown by SEQ ID NO:11 in Sequence Listing. The mutant
plasmid was prepared by deleting the portion immediately downstream of the ORF3 from the DNA insert in the plasmid
pFU1002 described in Example 2, wherein the ORF1 to the ORF6 were located downstream of the lac promoter on the
vector. The DNA polymerase used for the PCR was the Pfu DNA polymerase (manufactured by Stratagene),
which possesses high accuracy of synthesis reaction. A 25-cycle reaction of 100 µl of a reaction mixture for PCR [20
mM Tris-HCl, pH 8.2, 10 mM KCl, 20 mM MgCl2, 6 mM (NH4)2SO4, 0.2 mM each of dATP, dCTP, dGTP and dTTP, 1%
Triton X-100, 0.01% BSA] containing 1 ng of a template DNA, 25 pmol of each primer, and 2.5 units of the Pfu DNA
polymerase was carried out, wherein one cycle is defined as 94°C for 0.5 minutes, 55°C for 0.5 minutes and
72°C for 2 minutes. The amplified DNA fragment of about 2 kbp was digested with NcoI and SphI (each manufactured
by Takara Shuzo Co., Ltd.) and inserted between the NcoI-SphI sites of the pTV118N vector (manufactured by
Takara Shuzo Co., Ltd.) to prepare a plasmid pFU-ORF3. The DNA insert in the above plasmid contains ORF3 alone
in a translatable state.

(2) Construction of Plasmid for Expression of ORF4 Translation Product Alone

PCR was carried out using a mutant plasmid 6-2 as a template, together with the primer M4 and the primer N04,
whose base sequence is shown by SEQ ID NO:12 in Sequence Listing. The mutant plasmid was prepared by deleting the
portion downstream of the center portion of the ORF4 from the DNA insert in the above-described plasmid pFU1002.
The reaction was carried out under the same conditions as those for Example 5-(1) described above, except that the
template DNA was replaced with the plasmid 6-2, and the primer N03 was replaced with the primer N04. A DNA fragment
of about 1.6 kbp obtained by digesting the above amplified DNA fragment with NcoI and NheI (manufactured by Takara
Shuzo Co., Ltd.), together with an about 3.3 kbp NheI-SalI fragment, including the latter portion of ORF4, isolated from
the above plasmid pFU1002, was inserted between the NcoI-XhoI sites of a pET15b vector (manufactured by Novagen)
to prepare a plasmid pFU-ORF4. The DNA insert in the plasmid contains ORF4 alone in a translatable state.

(3) Reconstitution of DNA Polymerase with ORF3 and ORF4 Translation Products

Escherichia coli JM109 transformed with the above-described plasmid pFU-ORF3 (Escherichia coli JM109/pFU-ORF3)
and Escherichia coli HMS174 transformed with the above-described plasmid pFU-ORF4 (Escherichia coli
HMS174/pFU-ORF4) were separately cultured, and then the translation products of the two ORFs
expressed in their cells were purified. The cultivation of the transformants and the preparation of the crude extracts were
carried out by the methods described in Example 3. Purification of both translation products was carried out using
columns such as RESOURCE Q, HiTrap Heparin, and Superose 6, while the behaviors of the translation products on
SDS-PAGE were monitored. It was confirmed that although neither of the ORF translation products thus purified exhibited
DNA polymerase activity when assayed alone, thermostable DNA polymerase activity was exhibited when they
were mixed together.

INDUSTRIAL APPLICABILITY 

The present invention can provide a novel DNA polymerase possessing both high primer extensibility and high
3'->5' exonuclease activity. The enzyme is suitable for use in the PCR method and is useful as a reagent for genetic
engineering research. It is also possible to produce the enzyme by genetic engineering using the genes encoding
the DNA polymerase of the present invention.
SEQUENCE LISTING

SEQ ID NO:1
SEQUENCE LENGTH: 613 
SEQUENCE TYPE: amino acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULAR TYPE: peptide 
SEQUENCE DESCRIPTION: 

Met Asp Glu Phe Val Lys Ser Leu Leu Lys Ala Asn Tyr Leu Ile

5 10 15

Thr Pro Ser Ala Tyr Tyr Leu Leu Arg Glu Tyr Tyr Glu Lys Gly

20 25 30

Glu Phe Ser Ile Val Glu Leu Val Lys Phe Ala Arg Ser Arg Glu

35 40 45

Ser Tyr Ile Ile Thr Asp Ala Leu Ala Thr Glu Phe Leu Lys Val

50 55 60

Lys Gly Leu Glu Pro Ile Leu Pro Val Glu Thr Lys Gly Gly Phe

65 70 75

Val Ser Thr Gly Glu Ser Gln Lys Glu Gln Ser Tyr Glu Glu Ser

80 85 90

Phe Gly Thr Lys Glu Glu Ile Ser Gln Glu Ile Lys Glu Gly Glu

95 100 105

Ser Phe Ile Ser Thr Gly Ser Glu Pro Leu Glu Glu Glu Leu Asn

110 115 120

Ser Ile Gly Ile Glu Glu Ile Gly Ala Asn Glu Glu Leu Val Ser

125 130 135

Asn Gly Asn Asp Asn Gly Gly Glu Ala Ile Val Phe Asp Lys Tyr

140 145 150

Gly Tyr Pro Met Val Tyr Ala Pro Glu Glu Ile Glu Val Glu Glu

155 160 165

Lys Glu Tyr Ser Lys Tyr Glu Asp Leu Thr Ile Pro Met Asn Pro

170 175 180

Asp Phe Asn Tyr Val Glu Ile Lys Glu Asp Tyr Asp Val Val Phe

185 190 195

Asp Val Arg Asn Val Lys Leu Lys Pro Pro Lys Val Lys Asn Gly

200 205 210

Asn Gly Lys Glu Gly Glu Ile Ile Val Glu Ala Tyr Ala Ser Leu

215 220 225

Phe Arg Ser Arg Leu Lys Lys Leu Arg Lys Ile Leu Arg Glu Asn

230 235 240

Pro Glu Leu Asp Asn Val Val Asp Ile Gly Lys Leu Lys Tyr Val

245 250 255

Lys Glu Asp Glu Thr Val Thr Ile Ile Gly Leu Val Asn Ser Lys

260 265 270

Arg Glu Val Asn Lys Gly Leu Ile Phe Glu Ile Glu Asp Leu Thr

275 280 285

Gly Lys Val Lys Val Phe Leu Pro Lys Asp Ser Glu Asp Tyr Arg

290 295 300

Glu Ala Phe Lys Val Leu Pro Asp Ala Val Val Ala Phe Lys Gly

305 310 315

Val Tyr Ser Lys Arg Gly Ile Leu Tyr Ala Asn Lys Phe Tyr Leu

320 325 330

Pro Asp Val Pro Leu Tyr Arg Arg Gln Lys Pro Pro Leu Glu Glu

335 340 345

Lys Val Tyr Ala Ile Leu Ile Ser Asp Ile His Val Gly Ser Lys

350 355 360

Glu Phe Cys Glu Asn Ala Phe Ile Lys Phe Leu Glu Trp Leu Asn

365 370 375

Gly Asn Val Glu Thr Lys Glu Glu Glu Glu Ile Val Ser Arg Val

380 385 390

Lys Tyr Leu Ile Ile Ala Gly Asp Val Val Asp Gly Val Gly Val

395 400 405

Tyr Pro Gly Gln Tyr Ala Asp Leu Thr Ile Pro Asp Ile Phe Asp

410 415 420

Gln Tyr Glu Ala Leu Ala Asn Leu Leu Ser His Val Pro Lys His

425 430 435

Ile Thr Met Phe Ile Ala Pro Gly Asn His Asp Ala Ala Arg Gln

440 445 450

Ala Ile Pro Gln Pro Glu Phe Tyr Lys Glu Tyr Ala Lys Pro Ile

455 460 465

Tyr Lys Leu Lys Asn Ala Val Ile Ile Ser Asn Pro Ala Val Ile

470 475 480

Arg Leu His Gly Arg Asp Phe Leu Ile Ala His Gly Arg Gly Ile

485 490 495

Glu Asp Val Val Gly Ser Val Pro Gly Leu Thr His His Lys Pro

500 505 510

Gly Leu Pro Met Val Glu Leu Leu Lys Met Arg His Val Ala Pro

515 520 525

Met Phe Gly Gly Lys Val Pro Ile Ala Pro Asp Pro Glu Asp Leu

530 535 540

Leu Val Ile Glu Glu Val Pro Asp Val Val His Met Gly His Val

545 550 555

His Val Tyr Asp Ala Val Val Tyr Arg Gly Val Gln Leu Val Asn

560 565 570

Ser Ala Thr Trp Gln Ala Gln Thr Glu Phe Gln Lys Met Val Asn

575 580 585

Ile Val Pro Thr Pro Ala Lys Val Pro Val Val Asp Ile Asp Thr

590 595 600

Ala Lys Val Val Lys Val Leu Asp Phe Ser Gly Trp Cys

605 610
SEQ ID NO:2
SEQUENCE LENGTH: 1839
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULAR TYPE: Genomic DNA
SEQUENCE DESCRIPTION:

ATGGATGAAT TTGTAAAATC ACTTCTAAAA GCTAACTATC TAATAACTCC CTCTGCCTAC 60
TATCTCTTGA GAGAATACTA TGAAAAAGGT GAATTCTCAA TTGTGGAGCT GGTAAAATTT 120
GCAAGATCAA GAGAGAGCTA CATAATTACT GATGCTTTAG CAACAGAATT CCTTAAAGTT 180
AAAGGCCTTG AACCAATTCT TCCAGTGGAA ACAAAGGGGG GTTTTGTTTC CACTGGAGAG 240
TCCCAAAAAG AGCAGTCTTA TGAAGAGTCT TTTGGGACTA AAGAAGAAAT TTCCCAGGAG 300
ATTAAAGAAG GAGAGAGTTT TATTTCCACT GGAAGTGAAC CACTTGAAGA GGAGCTCAAT 360
AGCATTGGAA TTGAGGAAAT TGGGGCAAAT GAAGAGTTAG TTTCTAATGG AAATGACAAT 420
GGTGGAGAGG CAATTGTCTT TGACAAATAT GGCTATCCAA TGGTATATGC TCCAGAAGAA 480
ATAGAGGTTG AGGAGAAGGA GTACTCGAAG TATGAAGATC TGACAATACC CATGAACCCC 540
GACTTCAATT ATGTGGAAAT AAAGGAAGAT TATGATGTTG TCTTCGATGT TAGGAATGTA 600
AAGCTGAAGC CTCCTAAGGT AAAGAACGGT AATGGGAAGG AAGGTGAAAT AATTGTTGAA 660
GCTTATGCTT CTCTCTTCAG GAGTAGGTTG AAGAAGTTAA GGAAAATACT AAGGGAAAAT 720
CCTGAATTGG ACAATGTTGT TGATATTGGG AAGCTGAAGT ATGTGAAGGA AGATGAAACC 780
GTGACAATAA TAGGGCTTGT CAATTCCAAG AGGGAAGTGA ATAAAGGATT GATATTTGAA 840
ATAGAAGATC TCACAGGAAA GGTTAAAGTT TTCTTGCCGA AAGATTCGGA AGATTATAGG 900
GAGGCATTTA AGGTTCTTCC AGATGCCGTC GTCGCTTTTA AGGGGGTGTA TTCAAAGAGG 960
GGAATTTTGT ACGCCAACAA GTTTTACCTT CCAGACGTTC CCCTCTATAG GAGACAAAAG 1020
CCTCCACTGG AAGAGAAAGT TTATGCTATT CTCATAAGTG ATATACACGT CGGAAGTAAA 1080
GAGTTCTGCG AAAATGCCTT CATAAAGTTC TTAGAGTGGC TCAATGGAAA CGTTGAAACT 1140
AAGGAAGAGG AAGAAATCGT GAGTAGGGTT AAGTATCTAA TCATTGCAGG AGATGTTGTT 1200
GATGGTGTTG GCGTTTATCC GGGCCAGTAT GCCGACTTGA CGATTCCAGA TATATTCGAC 1260
CAGTATGAGG CCCTCGCAAA CCTTCTCTCT CACGTTCCTA AGCACATAAC AATGTTCATT 1320
GCCCCAGGAA ACCACGATGC TGCTAGGCAA GCTATTCCCC AACCAGAATT CTACAAAGAG 1380
TATGCAAAAC CTATATACAA GCTCAAGAAC GCCGTGATAA TAAGCAATCC TGCTGTAATA 1440
AGACTACATG GTAGGGACTT TCTGATAGCT CATGGTAGGG GGATAGAGGA TGTCGTTGGA 1500
AGTGTTCCTG GGTTGACCCA TCACAAGCCC GGCCTCCCAA TGGTTGAACT ATTGAAGATG 1560
AGGCATGTAG CTCCAATGTT TGGAGGAAAG GTTCCAATAG CTCCTGATCC AGAAGATTTG 1620
CTTGTTATAG AAGAAGTTCC TGATGTAGTT CACATGGGTC ACGTTCACGT TTACGATGCG 1680
GTAGTTTATA GGGGAGTTCA GCTGGTTAAC TCCGCCACCT GGCAGGCTCA GACCGAGTTC 1740
CAGAAGATGG TGAACATAGT TCCAACGCCT GCAAAGGTTC CCGTTGTTGA TATTGATACT 1800
GCAAAAGTTG TCAAGGTTTT GGACTTTAGT GGGTGGTGC 1839
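As a consistency check between the listings, the first 60 bases of SEQ ID NO:2 can be translated and compared with the first 20 residues of SEQ ID NO:1. The sketch below assumes the standard genetic code and reading from the first base; the names `CODON_TABLE` and `translate` are hypothetical helpers written for this illustration, and the residue string is the opening of SEQ ID NO:1 with the scanned "lie" read as Ile. It also checks that the stated lengths are consistent (1839 = 3 x 613 and 3789 = 3 x 1263).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First line of SEQ ID NO:2 and first 20 residues of SEQ ID NO:1.
FIRST_60_NT = "ATGGATGAAT TTGTAAAATC ACTTCTAAAA GCTAACTATC TAATAACTCC CTCTGCCTAC"
FIRST_20_RESIDUES = (
    "Met Asp Glu Phe Val Lys Ser Leu Leu Lys Ala Asn Tyr Leu Ile "
    "Thr Pro Ser Ala Tyr"
)

if __name__ == "__main__":
    expected = "".join(THREE_TO_ONE[r] for r in FIRST_20_RESIDUES.split())
    print(translate(FIRST_60_NT) == expected)
    print(1839 == 3 * 613, 3789 == 3 * 1263)
```

The lengths divide exactly by three with no extra codon, which suggests that the nucleotide listings cover the coding region without a stop codon.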

SEQ ID NO:3
SEQUENCE LENGTH: 1263
SEQUENCE TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULAR TYPE: peptide
SEQUENCE DESCRIPTION:

Met Glu Leu Pro Lys Glu Ile Glu Glu Tyr Phe Glu Met Leu Gln

5 10 15

Arg Glu Ile Asp Lys Ala Tyr Glu Ile Ala Lys Lys Ala Arg Ser

20 25 30

Gln Gly Lys Asp Pro Ser Thr Asp Val Glu Ile Pro Gln Ala Thr

35 40 45

Asp Met Ala Gly Arg Val Glu Ser Leu Val Gly Pro Pro Gly Val

50 55 60

Ala Gln Arg Ile Arg Glu Leu Leu Lys Glu Tyr Asp Lys Glu Ile

65 70 75

Val Ala Leu Lys Ile Val Asp Glu Ile Ile Glu Gly Lys Phe Gly

80 85 90

Asp Phe Gly Ser Lys Glu Lys Tyr Ala Glu Gln Ala Val Arg Thr

95 100 105

Ala Leu Ala Ile Leu Thr Glu Gly Ile Val Ser Ala Pro Leu Glu

110 115 120

Gly Ile Ala Asp Val Lys Ile Lys Arg Asn Thr Trp Ala Asp Asn

125 130 135

Ser Glu Tyr Leu Ala Leu Tyr Tyr Ala Gly Pro Ile Arg Ser Ser

140 145 150

Gly Gly Thr Ala Gln Ala Leu Ser Val Leu Val Gly Asp Tyr Val

155 160 165

Arg Arg Lys Leu Gly Leu Asp Arg Phe Lys Pro Ser Gly Lys His

170 175 180

Ile Glu Arg Met Val Glu Glu Val Asp Leu Tyr His Arg Ala Val

185 190 195

Ser Arg Leu Gln Tyr His Pro Ser Pro Asp Glu Val Arg Leu Ala

200 205 210

Met Arg Asn Ile Pro Ile Glu Ile Thr Gly Glu Ala Thr Asp Asp

215 220 225

Val Glu Val Ser His Arg Asp Val Glu Gly Val Glu Thr Asn Gln

230 235 240

Leu Arg Gly Gly Ala Ile Leu Val Leu Ala Glu Gly Val Leu Gln

245 250 255

Lys Ala Lys Lys Leu Val Lys Tyr Ile Asp Lys Met Gly Ile Asp

260 265 270

Gly Trp Glu Trp Leu Lys Glu Phe Val Glu Ala Lys Glu Lys Gly

275 280 285

Glu Glu Ile Glu Glu Ser Glu Ser Lys Ala Glu Glu Ser Lys Val

290 295 300

Glu Thr Arg Val Glu Val Glu Lys Gly Phe Tyr Tyr Lys Leu Tyr

305 310 315

Glu Lys Phe Arg Ala Glu Ile Ala Pro Ser Glu Lys Tyr Ala Lys

320 325 330

Glu Ile Ile Gly Gly Arg Pro Leu Phe Ala Gly Pro Ser Glu Asn

335 340 345

Gly Gly Phe Arg Leu Arg Tyr Gly Arg Ser Arg Val Ser Gly Phe

350 355 360

Ala Thr Trp Ser Ile Asn Pro Ala Thr Met Val Leu Val Asp Glu

365 370 375

Phe Leu Ala Ile Gly Thr Gln Met Lys Thr Glu Arg Pro Gly Lys

380 385 390

Gly Ala Val Val Thr Pro Ala Thr Thr Ala Glu Gly Pro Ile Val

395 400 405

Lys Leu Lys Asp Gly Ser Val Val Arg Val Asp Asp Tyr Asn Leu

410 415 420

Ala Leu Lys Ile Arg Asp Glu Val Glu Glu Ile Leu Tyr Leu Gly

425 430 435

Asp Ala Ile Ile Ala Phe Gly Asp Phe Val Glu Asn Asn Gln Thr

440 445 450

Leu Leu Pro Ala Asn Tyr Val Glu Glu Trp Trp Ile Gln Glu Phe

455 460 465

Val Lys Ala Val Asn Glu Ala Tyr Glu Val Glu Leu Arg Pro Phe

470 475 480

Glu Glu Asn Pro Arg Glu Ser Val Glu Glu Ala Ala Glu Tyr Leu

485 490 495

Glu Val Asp Pro Glu Phe Leu Ala Lys Met Leu Tyr Asp Pro Leu

500 505 510

Arg Val Lys Pro Pro Val Glu Leu Ala Ile His Phe Ser Glu Ile

515 520 525

Leu Glu Ile Pro Leu His Pro Tyr Tyr Thr Leu Tyr Trp Asn Thr

530 535 540

Val Asn Pro Lys Asp Val Glu Arg Leu Trp Gly Val Leu Lys Asp

545 550 555

Lys Ala Thr Ile Glu Trp Gly Thr Phe Arg Gly Ile Lys Phe Ala

560 565 570

Lys Lys Ile Glu Ile Ser Leu Asp Asp Leu Gly Ser Leu Lys Arg

575 580 585

Thr Leu Glu Leu Leu Gly Leu Pro His Thr Val Arg Glu Gly Ile

590 595 600

Val Val Val Asp Tyr Pro Trp Ser Ala Ala Leu Leu Thr Pro Leu

605 610 615

Gly Asn Leu Glu Trp Glu Phe Lys Ala Lys Pro Phe Tyr Thr Val

620 625 630

Ile Asp Ile Ile Asn Glu Asn Asn Gln Ile Lys Leu Arg Asp Arg

635 640 645

Gly Ile Ser Trp Ile Gly Ala Arg Met Gly Arg Pro Glu Lys Ala

650 655 660

Lys Glu Arg Lys Met Lys Pro Pro Val Gln Val Leu Phe Pro Ile

665 670 675

Gly Leu Ala Gly Gly Ser Ser Arg Asp Ile Lys Lys Ala Ala Glu

680 685 690

Glu Gly Lys Ile Ala Glu Val Glu Ile Ala Phe Phe Lys Cys Pro

695 700 705

Lys Cys Gly His Val Gly Pro Glu Thr Leu Cys Pro Glu Cys Gly

710 715 720

Ile Arg Lys Glu Leu Ile Trp Thr Cys Pro Lys Cys Gly Ala Glu

725 730 735

Tyr Thr Asn Ser Gln Ala Glu Gly Tyr Ser Tyr Ser Cys Pro Lys

740 745 750

Cys Asn Val Lys Leu Lys Pro Phe Thr Lys Arg Lys Ile Lys Pro

755 760 765

Ser Glu Leu Leu Asn Arg Ala Met Glu Asn Val Lys Val Tyr Gly

770 775 780

Val Asp Lys Leu Lys Gly Val Met Gly Met Thr Ser Gly Trp Lys

785 790 795

Ile Ala Glu Pro Leu Glu Lys Gly Leu Leu Arg Ala Lys Asn Glu

800 805 810

Val Tyr Val Phe Lys Asp Gly Thr Ile Arg Phe Asp Ala Thr Asp

815 820 825

Ala Pro Ile Thr His Phe Arg Pro Arg Glu Ile Gly Val Ser Val

830 835 840

Glu Lys Leu Arg Glu Leu Gly Tyr Thr His Asp Phe Glu Gly Lys

845 850 855

Pro Leu Val Ser Glu Asp Gln Ile Val Glu Leu Lys Pro Gln Asp

860 865 870

Val Ile Leu Ser Lys Glu Ala Gly Lys Tyr Leu Leu Arg Val Ala

875 880 885

Arg Phe Val Asp Asp Leu Leu Glu Lys Phe Tyr Gly Leu Pro Arg

890 895 900

Phe Tyr Asn Ala Glu Lys Met Glu Asp Leu Ile Gly His Leu Val

905 910 915

Ile Gly Leu Ala Pro His Thr Ser Ala Gly Ile Val Gly Arg Ile

920 925 930

Ile Gly Phe Val Asp Ala Leu Val Gly Tyr Ala His Pro Tyr Phe

935 940 945

His Ala Ala Lys Arg Arg Asn Cys Asp Gly Asp Glu Asp Ser Val

950 955 960

Met Leu Leu Leu Asp Ala Leu Leu Asn Phe Ser Arg Tyr Tyr Leu

965 970 975

Pro Glu Lys Arg Gly Gly Lys Met Asp Ala Pro Leu Val Ile Thr

980 985 990

Thr Arg Leu Asp Pro Arg Glu Val Asp Ser Glu Val His Asn Met

995 1000 1005

Asp Val Val Arg Tyr Tyr Pro Leu Glu Phe Tyr Glu Ala Thr Tyr

1010 1015 1020

Glu Leu Lys Ser Pro Lys Glu Leu Val Arg Val Ile Glu Gly Val

1025 1030 1035

Glu Asp Arg Leu Gly Lys Pro Glu Met Tyr Tyr Gly Ile Lys Phe

1040 1045 1050

Thr His Asp Thr Asp Asp Ile Ala Leu Gly Pro Lys Met Ser Leu

1055 1060 1065

Tyr Lys Gln Leu Gly Asp Met Glu Glu Lys Val Lys Arg Gln Leu

1070 1075 1080

Thr Leu Ala Glu Arg Ile Arg Ala Val Asp Gln His Tyr Val Ala

1085 1090 1095

Glu Thr Ile Leu Asn Ser His Leu Ile Pro Asp Leu Arg Gly Asn

1100 1105 1110

Leu Arg Ser Phe Thr Arg Gln Glu Phe Arg Cys Val Lys Cys Asn

1115 1120 1125

Thr Lys Tyr Arg Arg Pro Pro Leu Asp Gly Lys Cys Pro Val Cys

1130 1135 1140

Gly Gly Lys Ile Val Leu Thr Val Ser Lys Gly Ala Ile Glu Lys

1145 1150 1155

Tyr Leu Gly Thr Ala Lys Met Leu Val Ala Asn Tyr Asn Val Lys

1160 1165 1170

Pro Tyr Thr Arg Gln Arg Ile Cys Leu Thr Glu Lys Asp Ile Asp

1175 1180 1185

Ser Leu Phe Glu Tyr Leu Phe Pro Glu Ala Gln Leu Thr Leu Ile

1190 1195 1200

Val Asp Pro Asn Asp Ile Cys Met Lys Met Ile Lys Glu Arg Thr

1205 1210 1215

Gly Glu Thr Val Gln Gly Gly Leu Leu Glu Asn Phe Asn Ser Ser

1220 1225 1230

Gly Asn Asn Gly Lys Lys Ile Glu Lys Lys Glu Lys Lys Ala Lys

1235 1240 1245

Glu Lys Pro Lys Lys Lys Lys Val Ile Ser Leu Asp Asp Phe Phe

1250 1255 1260

Ser Lys Arg

SEQ ID NO: 4

SEQUENCE LENGTH: 3789 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double
TOPOLOGY: linear

MOLECULAR TYPE: Genomic DNA 

SEQUENCE DESCRIPTION: 

ATGGAGCTTC CAAAGGAAAT TGAGGAGTAT TTTGAGATGC TTCAAAGGGA AATTGACAAA 60 
GCTTACGAGA TTGCTAAGAA GGCTAGGAGT CAGGGTAAAG ACCCCTCAAC CGATGTTGAG 120

ATTCCCCAGG CTACAGACAT GGCTGGAAGA GTTGAGAGCT TAGTTGGCCC TCCCGGAGTT 180 
GCTCAGAGAA TTAGGGAGCT TTTAAAAGAG TATGATAAGG AAATTGTTGC TTTAAAGATA 240 
GTTGATGAGA TAATTGAGGG CAAATTTGGT GATTTTGGAA GTAAAGAGAA GTACGCTGAA 300 


CAGGCTGTAA GGACAGCCTT GGCAATATTA ACTGAGGGTA TTGTTTCTGC TCCACTTGAG 360 
GGTATAGCTG ATGTTAAAAT CAAGCGAAAC ACCTGGGCTG ATAACTCTGA ATACCTCGCC 420 
CTTTACTATG CTGGGCCAAT TAGGAGTTCT GGTGGAACTG CTCAAGCTCT CAGTGTACTT 480 

GTTGGTGATT ACGTTAGGCG AAAGCTTGGC CTTGATAGGT TTAAGCCAAG TGGGAAGCAT 540
ATAGAGAGAA TGGTTGAGGA AGTTGACCTC TATCATAGAG CTGTTTCAAG GCTTCAATAT 600
CATCCCTCAC CTGATGAAGT GAGATTAGCA ATGAGGAATA TTCCCATAGA AATCACTGGT 660
GAAGCCACTG ACGATGTGGA GGTTTCCCAT AGAGATGTAG AGGGAGTTGA GACAAATCAG 720
CTGAGAGGAG GAGCGATCCT AGTTTTGGCG GAGGGTGTTC TCCAGAAGGC TAAAAAGCTC 780
GTGAAATACA TTGACAAGAT GGGGATTGAT GGATGGGAGT GGCTTAAAGA GTTTGTAGAG 840
GCTAAAGAAA AAGGTGAAGA AATCGAAGAG AGTGAAAGTA AAGCCGAGGA GTCAAAAGTT 900 
GAAACAAGGG TGGAGGTAGA GAAGGGATTC TACTACAAGC TCTATGAGAA ATTTAGGGCT 960 


GAGATTGCCC CAAGCGAAAA GTATGCAAAG GAAATAATTG GTGGGAGGCC GTTATTCGCT 1020 
GGACCCTCGG AAAATGGGGG ATTTAGGCTT AGATATGGTA GAAGTAGGGT GAGTGGATTT 1080 
GCAACATGGA GCATAAATCC AGCAACAATG GTTTTGGTTG ACGAGTTCTT GGCCATTGGA 1140 

ACTCAAATGA AAACCGAGAG GCCTGGGAAA GGTGCAGTAG TGACTCCAGC AACAACCGCT 1200

GAAGGGCCGA TTGTTAAGCT AAAGGATGGG AGTGTTGTTA GGGTTGATGA TTACAACTTG 1260 
GCCCTCAAAA TAAGGGATGA AGTCGAAGAG ATACTTTATT TGGGAGATGC AATCATAGCC 1320 
TTTGGAGACT TTGTGGAGAA CAATCAAACT CTCCTTCCTG CAAACTATGT AGAGGAGTGG 1380 

TGGATCCAAG AGTTCGTAAA GGCCGTTAAT GAGGCATATG AAGTTGAGCT TAGACCCTTT 1440

GAGGAAAATC CCAGGGAGAG CGTTGAGGAA GCAGCAGAGT ACCTTGAAGT TGACCCAGAA 1500 

TTCTTGGCTA AGATGCTTTA CGATCCTCTA AGGGTTAAGC CTCCCGTGGA GCTAGCCATA 1560
CACTTCTCGG AAATCCTGGA AATTCCTCTC CACCCATACT ACACCCTTTA TTGGAATACT 1620
GTAAATCCTA AAGATGTTGA AAGACTTTGG GGAGTATTAA AAGACAAGGC CACCATAGAA 1680

TGGGGCACTT TCAGAGGTAT AAAGTTTGCA AAGAAAATTG AAATTAGCCT GGACGACCTG 1740 
GGAAGTCTTA AGAGAACCCT AGAGCTCCTG GGACTTCCTC ATACGGTAAG AGAAGGGATT 1800 
GTAGTGGTTG ATTATCCGTG GAGTGCAGCT CTTCTCACTC CATTGGGCAA TCTTGAATGG 1860 

GAGTTTAAGG CCAAGCCCTT CTACACTGTA ATAGACATCA TTAACGAGAA CAATCAGATA 1920

AAGCTCAGGG ACAGGGGAAT AAGCTGGATA GGGGCAAGAA TGGGAAGGCC AGAGAAGGCA 1980 
AAAGAAAGAA AAATGAAGCC ACCTGTTCAA GTCCTCTTCC CAATTGGCTT GGCAGGGGGT 2040 
TCTAGCAGAG ATATAAAGAA GGCTGCTGAA GAGGGAAAAA TAGCTGAAGT TGAGATTGCT 2100 
TTCTTCAAGT GTCCGAAGTG TGGCCATGTA GGGCCTGAAA CTCTCTGTCC CGAGTGTGGG 2160 
ATTAGGAAAG AGTTGATATG GACATGTCCC AAGTGTGGGG CTGAATACAC CAATTCCCAG 2220 
GCTGAGGGGT ACTCGTATTC ATGTCCAAAG TGCAATGTGA AGCTAAAGCC ATTCACAAAG 2280 

AGGAAGATAA AGCCCTCAGA GCTCTTAAAC AGGGCCATGG AAAACGTGAA GGTTTATGGA 2340

GTTGACAAGC TTAAGGGCGT AATGGGAATG ACTTCTGGCT GGAAGATTGC AGAGCCGCTG 2400 
GAGAAAGGTC TTTTGAGAGC AAAAAATGAA GTTTACGTCT TTAAGGATGG AACCATAAGA 2460 
TTTGATGCCA CAGATGCTCC AATAACTCAC TTTAGGCCTA GGGAGATAGG AGTTTCAGTG 2520 

GAAAAGCTGA GAGAGCTTGG CTACACCCAT GACTTCGAAG GGAAACCTCT GGTGAGTGAA 2580

GACCAGATAG TTGAGCTTAA GCCCCAAGAT GTAATCCTCT CAAAGGAGGC TGGCAAGTAC 2640 
CTCTTAAGAG TGGCCAGGTT TGTTGATGAT CTTCTTGAGA AGTTCTACGG ACTTCCCAGG 2700 
TTCTACAACG CCGAAAAAAT GGAGGATTTA ATTGGTCACC TAGTGATAGG ATTGGCCCCT 2760 


CACACTTCAG CCGGAATCGT GGGGAGGATA ATAGGCTTTG TAGATGCTCT GGTTGGCTAC 2820 
GCTCACCCCT ACTTCCATGC GGCCAAGAGA AGGAACTGTG ATGGAGATGA GGATAGTGTA 2880 
ATGCTACTCC TTGATGCCCT ATTGAACTTC TCCAGATACT ACCTCCCCGA AAAAAGAGGA 2940 
GGAAAAATGG ACGCTCCTCT TGTCATAACC ACGAGGCTTG ATCCAAGAGA GGTGGACAGT 3000 


GAAGTGCACA ACATGGATGT CGTTAGATAC TATCCATTAG AGTTCTATGA AGCAACTTAC 3060 
GAGCTTAAAT CACCAAAGGA ACTTGTGAGA GTTATAGAGG GAGTTGAAGA TAGATTAGGA 3120 
AAGCCTGAAA TGTATTACGG AATAAAGTTC ACCCACGATA CCGACGACAT AGCTCTAGGA 3180 

CCAAAGATGA GCCTCTACAA GCAGTTGGGA GATATGGAGG AGAAAGTGAA GAGGCAATTG 3240

ACATTGGCAG AGAGAATTAG AGCTGTGGAT CAACACTATG TTGCTGAAAC AATCCTCAAC 3300 
TCCCACTTAA TTCCCGACTT GAGGGGTAAC CTAAGGAGCT TTACTAGACA AGAATTTCGC 3360 
TGTGTGAAGT GTAACACAAA GTACAGAAGG CCGCCCTTGG ATGGAAAATG CCCAGTCTGT 3420 

GGAGGAAAGA TAGTGCTGAC AGTTAGCAAA GGAGCCATTG AAAAGTACTT GGGGACTGCC 3480

AAGATGCTCG TAGCTAACTA CAACGTAAAG CCATATACAA GGCAGAGAAT ATGCTTGACG 3540 

GAGAAGGATA TTGATTCACT CTTTGAGTAC TTATTCCCAG AAGCCCAGTT AACGCTCATT 3600
GTAGATCCAA ACGACATCTG TATGAAAATG ATCAAGGAAA GAACGGGGGA AACAGTTCAA 3660
GGAGGCCTGC TTGAGAACTT TAATTCCTCT GGAAATAATG GGAAGAAAAT AGAGAAGAAG 3720

GAGAAAAAGG CAAAGGAAAA GCCTAAAAAG AAGAAAGTTA TAAGCTTGGA CGACTTCTTC 3780 
TCCAAACGC 3789 
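
As a rough consistency check, the 3789 bases of SEQ ID NO: 4 should correspond to 1263 codons, which agrees with the final residue number of the amino acid sequence listed immediately before it. The Python sketch below is illustrative only and is not part of the original specification. It assumes the standard genetic code, and the names CODON_TABLE and translate are hypothetical. It translates only the first 60 bases and the last 9 bases as they appear in the listing above.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
# This table is an assumption of the sketch; the listing does not name one.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First row and final nine bases of SEQ ID NO: 4, copied from the listing.
FIRST_60 = "ATGGAGCTTC CAAAGGAAAT TGAGGAGTAT TTTGAGATGC TTCAAAGGGA AATTGACAAA"
LAST_9 = "TCCAAACGC"
DECLARED_LENGTH = 3789

if __name__ == "__main__":
    print("codons in SEQ ID NO: 4:", DECLARED_LENGTH // 3)
    print("N-terminal residues:", translate(FIRST_60))
    print("C-terminal residues:", translate(LAST_9))
```

The last three codons translate to Ser Lys Arg, which matches the final residues of the amino acid listing that precedes SEQ ID NO: 4.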

SEQ ID NO: 5

SEQUENCE LENGTH: 8450 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double
TOPOLOGY: linear

MOLECULAR TYPE: Genomic DNA 

SEQUENCE DESCRIPTION: 

CATAACTAAA TTATTACATT TAGTTATATG GATGGGGGAA AAATTAACAA CATGTGTTAT 60 


GTTTCCTCTG GAAAATTGAT CTATAATAAT CTAGGAGCAC AATTTCCAAT GGAGGGTCAT 120 
CAATGAACGA AGGTGAACAT CAAATAAAGC TTGACGAGCT ATTCGAAAAG TTGCTCCGAG 180 
CTAGGAAGAT ATTCAAAAAC AAAGATGTCC TTAGGCATAG CTATACTCCC AAGGATCTAC 240 

CTCACAGACA TGAGCAAATA GAAACTCTCG CCCAAATTTT AGTACCAGTT CTCAGAGGAG 300
AAACTCCATC AAACATATTC GTTTATGGGA AGACTGGAAC TGGAAAGACT GTAACTGTAA 360
AATTTGTAAC TGAAGAGCTG AAAAGAATAT CTGAAAAATA CAACATTCCA GTTGATGTGA 420
TCTACATTAA TTGTGAGATT GTCGATACTC ACTATAGAGT TCTTGCTAAC ATAGTTAACT 480
ACTTCAAAGA TGAGACTGGG ATTGAAGTTC CAATGGTAGG TTGGCCTACC GATGAAGTTT 540
ACGCAAAGCT TAAGCAGGTT ATAGATATGA AGGAGAGGTT TGTGATAATT GTGTTGGATG 600
AAATTGACAA GTGGTAAAGA AGAGTGGTGA TGAGGTTCTC TATTCATTAA CAAGAATAAA 660 
TACTGAACTT AAAAGGGCTA AAGTGAGTGT AATTGGTATA TCAAACGACC TTAAATTTAA 720 


AGAGTATCTA GATCCAAGAG TTCTCTCAAG TTTGAGTGAG GAAGAGGTGG TATTCCCACC 780 
CTATGATGCA AATCAGCTTA GGGATATACT GACCCAAAGA GCTGAAGAGG CCTTTTATCC 840 
TGGGGTTTTA GACGAAGGTG TGATTCCCCT CTGTGCAGCA TTAGCTGCTA GAGAGCATGG 900 

AGATGCAAGA AAGGCACTTG ACCTTCTAAG AGTTGCAGGG GAAATAGCGG AAAGAGAAGG 960
GGCAAGTAAA GTAACTGAAA AGCATGTTTG GAAAGCCCAG GAAAAGATTG AACAGGACAT 1020
GATGGAGGAG GTAATAAAAA CTCTACCCCT TCAGTCAAAA GTTCTCCTCT ATGCCATAGT 1080
TCTTTTGGAC GAAAACGGCG ATTTACCAGC AAATACTGGG GATGTTTACG CTGTTTATAG 1140
GGAATTGTGC GAGTACATTG ACTTGGAACC TCTCACCCAA AGAAGGATAA GTGATCTAAT 1200
TAATGAGCTT GACATGCTTG GAATAATAAA TGCAAAAGTT GTTAGTAAGG GGAGATATGG 1260

GAGGACAAAG GAAATAAGGC TTAACGTTAC CTCATATAAG ATAAGAAATG TGCTGAGATA 1320
TGATTACTCT ATTCAGCCCC TCCTCACAAT TTCCCTTAAG AGTGAGCAGA GGAGGTTGAT 1380
CTAATGGATG AATTTGTAAA ATCACTTCTA AAAGCTAACT ATCTAATAAC TCCCTCTGCC 1440
TACTATCTCT TGAGAGAATA CTATGAAAAA GGTGAATTCT CAATTGTGGA GCTGGTAAAA 1500
TTTGCAAGAT CAAGAGAGAG CTACATAATT ACTGATGCTT TAGCAACAGA ATTCCTTAAA 1560
GTTAAAGGCC TTGAAGCAAT TCTTCCAGTG GAAACAAAGG GGGGTTTTGT TTCCACTGGA 1620
GAGTCCCAAA AAGAGCAGTC TTATGAAGAG TCTTTTGGGA CTAAAGAAGA AATTTCCCAG 1680
GAGATTAAAG AAGGAGAGAG TTTTATTTCC ACTGGAAGTG AACCACTTGA AGAGGAGCTC 1740
AATAGCATTG GAATTGAGGA AATTGGGGCA AATGAAGAGT TAGTTTCTAA TGGAAATGAC 1800
AATGGTGGAG AGGCAATTGT CTTTGACAAA TATGGCTATC CAATGGTATA TGCTCCAGAA 1860
GAAATAGAGG TTGAGGAGAA GGAGTACTCG AAGTATGAAG ATCTGACAAT ACCCATGAAC 1920
CCCGACTTCA ATTATGTGGA AATAAAGGAA GATTATGATG TTGTCTTCGA TGTTAGGAAT 1980
GTAAAGCTGA AGCCTCCTAA GGTAAAGAAC GGTAATGGGA AGGAAGGTGA AATAATTGTT 2040
GAAGCTTATG CTTCTCTCTT CAGGAGTAGG TTGAAGAAGT TAAGGAAAAT ACTAAGGGAA 2100
AATCCTGAAT TGGACAATGT TGTTGATATT GGGAAGCTGA AGTATGTGAA GGAAGATGAA 2160
ACCGTGACAA TAATAGGGCT TGTCAATTCC AAGAGGGAAG TGAATAAAGG ATTGATATTT 2220
GAAATAGAAG ATCTCACAGG AAAGGTTAAA GTTTTCTTGC CGAAAGATTC GGAAGATTAT 2280
AGGGAGGCAT TTAAGGTTCT TCCAGATGCC GTCGTCGCTT TTAAGGGGGT GTATTCAAAG 2340
AGGGGAATTT TGTACGCCAA CAAGTTTTAC CTTCCAGACG TTCCCCTCTA TAGGAGACAA 2400
AAGCCTCCAC TGGAAGAGAA AGTTTATGCT ATTCTCATAA GTGATATACA CGTCGGAAGT 2460
AAAGAGTTCT GCGAAAATGC CTTCATAAAG TTCTTAGAGT GGCTCAATGG AAACGTTGAA 2520
ACTAAGGAAG AGGAAGAAAT CGTGAGTAGG GTTAAGTATC TAATCATTGC AGGAGATGTT 2580
GTTGATGGTG TTGGCGTTTA TCCGGGCCAG TATGCCGACT TGACGATTCC AGATATATTC 2640
GACCAGTATG AGGCCCTCGC AAACCTTCTC TCTCACGTTC CTAAGCACAT AACAATGTTC 2700
ATTGCCCCAG GAAACCACGA TGCTGCTAGG CAAGCTATTC CCCAACCAGA ATTCTACAAA 2760
GAGTATGCAA AACCTATATA CAAGCTCAAG AACGCCGTGA TAATAAGCAA TCCTGCTGTA 2820
ATAAGACTAC ATGGTAGGGA CTTTCTGATA GCTCATGGTA GGGGGATAGA GGATGTCGTT 2880
GGAAGTGTTC CTGGGTTGAC CCATCACAAG CCCGGCCTCC CAATGGTTGA ACTATTGAAG 2940
ATGAGGCATG TAGCTCCAAT GTTTGGAGGA AAGGTTCCAA TAGCTCCTGA TCCAGAAGAT 3000
TTGCTTGTTA TAGAAGAAGT TCCTGATGTA GTTCACATGG GTCACGTTCA CGTTTACGAT 3060
GCGGTAGTTT ATAGGGGAGT TCAGCTGGTT AACTCCGCCA CCTGGCAGGC TCAGACCGAG 3120
TTCCAGAAGA TGGTGAACAT AGTTCCAACG CCTGCAAAGG TTCCCGTTGT TGATATTGAT 3180
ACTGCAAAAG TTGTCAAGGT TTTGGACTTT AGTGGGTGGT GCTGATGGAG CTTCCAAAGG 3240
AAATTGAGGA GTATTTTGAG ATGCTTCAAA GGGAAATTGA CAAAGCTTAC GAGATTGCTA 3300
AGAAGGCTAG GAGTCAGGGT AAAGACCCCT CAACCGATGT TGAGATTCCC CAGGCTACAG 3360
ACATGGCTGG AAGAGTTGAG AGCTTAGTTG GCCCTCCCGG AGTTGCTCAG AGAATTAGGG 3420
AGCTTTTAAA AGAGTATGAT AAGGAAATTG TTGCTTTAAA GATAGTTGAT GAGATAATTG 3480
AGGGCAAATT TGGTGATTTT GGAAGTAAAG AGAAGTACGC TGAACAGGCT GTAAGGACAG 3540
CCTTGGCAAT ATTAACTGAG GGTATTGTTT CTGCTCCACT TGAGGGTATA GCTGATGTTA 3600
AAATCAAGCG AAACACCTGG GCTGATAACT CTGAATACCT CGCCCTTTAC TATGCTGGGC 3660
CAATTAGGAG TTCTGGTGGA ACTGCTCAAG CTCTCAGTGT ACTTGTTGGT GATTACGTTA 3720
GGCGAAAGCT TGGCCTTGAT AGGTTTAAGC CAAGTGGGAA GCATATAGAG AGAATGGTTG 3780
AGGAAGTTGA CCTCTATCAT AGAGCTGTTT CAAGGCTTCA ATATCATCCC TCACCTGATG 3840
AAGTGAGATT AGCAATGAGG AATATTCCCA TAGAAATCAC TGGTGAAGCC ACTGACGATG 3900
TGGAGGTTTC CCATAGAGAT GTAGAGGGAG TTGAGACAAA TCAGCTGAGA GGAGGAGCGA 3960
TCCTAGTTTT GGCGGAGGGT GTTCTCCAGA AGGCTAAAAA GCTCGTGAAA TACATTGACA 4020
AGATGGGGAT TGATGGATGG GAGTGGCTTA AAGAGTTTGT AGAGGCTAAA GAAAAAGGTG 4080
AAGAAATCGA AGAGAGTGAA AGTAAAGCCG AGGAGTCAAA AGTTGAAACA AGGGTGGAGG 4140
TAGAGAAGGG ATTCTACTAC AAGCTCTATG AGAAATTTAG GGCTGAGATT GCCCCAAGCG 4200
AAAAGTATGC AAAGGAAATA ATTGGTGGGA GGCCGTTATT CGCTGGACCC TCGGAAAATG 4260
GGGGATTTAG GCTTAGATAT GGTAGAAGTA GGGTGAGTGG ATTTGCAACA TGGAGCATAA 4320
ATCCAGCAAC AATGGTTTTG GTTGACGAGT TCTTGGCCAT TGGAACTCAA ATGAAAACCG 4380
AGAGGCCTGG GAAAGGTGCA GTAGTGACTC CAGCAACAAC CGCTGAAGGG CCGATTGTTA 4440
AGCTAAAGGA TGGGAGTGTT GTTAGGGTTG ATGATTACAA CTTGGCCCTC AAAATAAGGG 4500
ATGAAGTCGA AGAGATACTT TATTTGGGAG ATGCAATCAT AGCCTTTGGA GACTTTGTGG 4560
AGAACAATCA AACTCTCCTT CCTGCAAACT ATGTAGAGGA GTGGTGGATC CAAGAGTTCG 4620
TAAAGGCCGT TAATGAGGCA TATGAAGTTG AGCTTAGACC CTTTGAGGAA AATCCCAGGG 4680
AGAGCGTTGA GGAAGCAGCA GAGTACCTTG AAGTTGACCC AGAATTCTTG GCTAAGATGC 4740
TTTACGATCC TCTAAGGGTT AAGCCTCCCG TGGAGCTAGC CATACACTTC TCGGAAATCC 4800
TGGAAATTCC TCTCCACCCA TACTACACCC TTTATTGGAA TACTGTAAAT CCTAAAGATG 4860
TTGAAAGACT TTGGGGAGTA TTAAAAGACA AGGCCACCAT AGAATGGGGC ACTTTCAGAG 4920
GTATAAAGTT TGCAAAGAAA ATTGAAATTA GCCTGGACGA CCTGGGAAGT CTTAAGAGAA 4980
CCCTAGAGCT CCTGGGACTT CCTCATACGG TAAGAGAAGG GATTGTAGTG GTTGATTATC 5040
CGTGGAGTGC AGCTCTTCTC ACTCCATTGG GCAATCTTGA ATGGGAGTTT AAGGCCAAGC 5100
CCTTCTACAC TGTAATAGAC ATCATTAACG AGAACAATCA GATAAAGCTC AGGGACAGGG 5160
GAATAAGCTG GATAGGGGCA AGAATGGGAA GGCCAGAGAA GGCAAAAGAA AGAAAAATGA 5220
AGCCACCTGT TCAAGTCCTC TTCCCAATTG GCTTGGCAGG GGGTTCTAGC AGAGATATAA 5280
AGAAGGCTGC TGAAGAGGGA AAAATAGCTG AAGTTGAGAT TGCTTTCTTC AAGTGTCCGA 5340



AGTGTGGCCA TGTAGGGCCT GAAACTCTCT GTCCCGAGTG TGGGATTAGG AAAGAGTTGA 5400
TATGGACATG TCCCAAGTGT GGGGCTGAAT ACACCAATTC CCAGGCTGAG GGGTACTCGT 5460
ATTCATGTCC AAAGTGCAAT GTGAAGCTAA AGCCATTCAC AAAGAGGAAG ATAAAGCCCT 5520
CAGAGCTCTT AAACAGGGCC ATGGAAAACG TGAAGGTTTA TGGAGTTGAC AAGCTTAAGG 5580
GCGTAATGGG AATGACTTCT GGCTGGAAGA TTGCAGAGCC GCTGGAGAAA GGTCTTTTGA 5640
GAGCAAAAAA TGAAGTTTAC GTCTTTAAGG ATGGAACCAT AAGATTTGAT GCCACAGATG 5700
CTCCAATAAC TCACTTTAGG CCTAGGGAGA TAGGAGTTTC AGTGGAAAAG CTGAGAGAGC 5760
TTGGCTACAC CCATGACTTC GAAGGGAAAC CTCTGGTGAG TGAAGACCAG ATAGTTGAGC 5820
TTAAGCCCCA AGATGTAATC CTCTCAAAGG AGGCTGGCAA GTACCTCTTA AGAGTGGCCA 5880
GGTTTGTTGA TGATCTTCTT GAGAAGTTCT ACGGACTTCC CAGGTTCTAC AACGCCGAAA 5940
AAATGGAGGA TTTAATTGGT CACCTAGTGA TAGGATTGGC CCCTCACACT TCAGCCGGAA 6000
TCGTGGGGAG GATAATAGGC TTTGTAGATG CTCTGGTTGG CTACGCTCAC CCCTACTTCC 6060
ATGCGGCCAA GAGAAGGAAC TGTGATGGAG ATGAGGATAG TGTAATGCTA CTCCTTGATG 6120
CCCTATTGAA CTTCTCCAGA TACTACCTCC CCGAAAAAAG AGGAGGAAAA ATGGACGCTC 6180
CTCTTGTCAT AACCACGAGG CTTGATCCAA GAGAGGTGGA CAGTGAAGTG CACAACATGG 6240
ATGTCGTTAG ATACTATCCA TTAGAGTTCT ATGAAGCAAC TTACGAGCTT AAATCACCAA 6300
AGGAACTTGT GAGAGTTATA GAGGGAGTTG AAGATAGATT AGGAAAGCCT GAAATGTATT 6360
ACGGAATAAA GTTCACCCAC GATACCGACG ACATAGCTCT AGGACCAAAG ATGAGCCTCT 6420
ACAAGCAGTT GGGAGATATG GAGGAGAAAG TGAAGAGGCA ATTGACATTG GCAGAGAGAA 6480
TTAGAGCTGT GGATCAACAC TATGTTGCTG AAACAATCCT CAACTCCCAC TTAATTCCCG 6540
ACTTGAGGGG TAACCTAAGG AGCTTTACTA GACAAGAATT TCGCTGTGTG AAGTGTAACA 6600
CAAAGTACAG AAGGCCGCCC TTGGATGGAA AATGCCCAGT CTGTGGAGGA AAGATAGTGC 6660
TGACAGTTAG CAAAGGAGCC ATTGAAAAGT ACTTGGGGAC TGCCAAGATG CTCGTAGCTA 6720
ACTACAACGT AAAGCCATAT ACAAGGCAGA GAATATGCTT GACGGAGAAG GATATTGATT 6780
CACTCTTTGA GTACTTATTC CCAGAAGCCC AGTTAACGCT CATTGTAGAT CCAAACGACA 6840
TCTGTATGAA AATGATCAAG GAAAGAACGG GGGAAACAGT TCAAGGAGGC CTGCTTGAGA 6900
ACTTTAATTC CTCTGGAAAT AATGGGAAGA AAATAGAGAA GAAGGAGAAA AAGGCAAAGG 6960
AAAAGCCTAA AAAGAAGAAA GTTATAAGCT TGGACGACTT CTTCTCCAAA CGCTGACCAC 7020
AACTTTTAAG TTCTTTCTTG AGAATAAATT CCCAGGTGGC TTAGAGAATG AAGATTGTGT 7080
GGTGTGGTCA TGCCTGCTTC TTGGTGGAGG ATAGGGGGAC TAAGATACTA ATCGATCCAT 7140
ACCCAGACGT TGATGAAGAC AGAATAGGCA AGGTCGATTA CATTCTAGTT ACCCACGAGC 7200
ACATGGATCA CTACGGTAAG ACCCCACTAA TAGCAAAGCT CAGTGATGCC GAGGTTATAG 7260
GGCCGAAAAC AGTTTATCTC ATGGCAATAA GTGATGGGCT AACAAAGGTC AGAGAGATAG 7320
AGGTGGGACA GGAAATCGAG CTGGGAGATA TTAGGGTTAG GGCATTTTTC ACAGAGCATC 7380



CAACAAGCCA GTATCCCCTG GGATATCTAA TTGAAGGAAG CAAAAGAGTG GCTCACTTGG 7440
GAGATACATA CTACAGTCCA GCTTTTACAG AGTTGAGGGG AAAGGTTGAT GTTCTTTTGG 7500
TTCCAATAGG TGGGAAGTCC ACCGCTAGTG TAAGGGAGGC TGCGGATATA GTGGAGATGA 7560
TAAGGCCCAG GATAGCAGTT CCAATGCACT ATGGAACGTA CAGCGAGGCC GATCCTGAAG 7620
AGTTCAAGAA GGAGCTCCAA AAAAGGCGCA TATGGGTTTT AGTAAAGGAT CTTAAGCCCT 7680
ATGAGGGTTT TGAAATCTGA AGGTGTTTCA ATGCTAAATA CTGAGCTCTT AACCACTGGA 7740
GTCAAGGGGT TAGATGAGCT TTTAGGTGGT GGAGTTGCTA AGGGAGTAAT ACTCCAAGTT 7800
TACGGGCCAT TTGCCACCGG GAAGACAACT TTTGGAATGC AGGTTGGATT ATTGAATGAG 7860
GGAAAAGTGG CTTATGTTGA TACTGAGGGG GGATTCTCCC CCGAAAGGTT AGCTCAAATG 7920
GCAGAATCAA GGAACTTGGA TGTGGAGAAA GCACTTGAAA AGTTCGTGAT ATTCGAACCT 7980
ATGGATTTAA ACGAGCAAAG ACAGGTAATT GCGAGGTTGA AAAATATCGT GAATGAAAAG 8040
TTTTCTTTAG TTGTGGTCGA CTCCTTTACG GCCCATTATA GAGCGGAGGG GAGTAGAGAG 8100
TATGGAGAAC TTTCCAAGCA ACTCCAAGTT CTTCAGTGGA TTGCCAGAAG AAAAAACGTT 8160
GCCGTTATAG TTGTCAATCA AGTTTATTAC GATTCAAACT CAGGAATTCT TAAACCAATA 8220
GCTGAGCACA CCCTGGGGTA CAAAACAAAG GACATCCTCC GCTTTGAAAG GCTTAGGGTT 8280
GGAGTGAGAA TTGCAGTTCT GGAAAGGCAT AGGTTTAGGC CAGAGGGTGG GATGGTATAC 8340
TTCAAAATAA CAGATAAAGG ATTGGAGGAT GTAAAAAACG AAGATTAGAG CCTGTCGTAG 8400
ACCTCCTGGG CAATCCTCAG CGTTGCCTTA TAGAGCTTCT CACTAATAAT 8450




SEQ ID NO: 6 
SEQUENCE LENGTH: 45 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear

MOLECULAR TYPE: other nucleic acid (synthetic DNA)
SEQUENCE DESCRIPTION: 

CCGGAACCGC CTCCCTCAGA GCCGCCACCC TCAGAACCGC CACCC 45 

SEQ ID NO: 7 
SEQUENCE LENGTH: 17 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULAR TYPE: other nucleic acid (synthetic RNA)
SEQUENCE DESCRIPTION:
GUUUUCCCAG UCACGAC 

SEQ ID NO: 8
SEQUENCE LENGTH: 23
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULAR TYPE: other nucleic acid (synthetic DNA) 
SEQUENCE DESCRIPTION: 
GATGAGTTCG TGTCCGTACA ACT 

SEQ ID NO: 9 
SEQUENCE LENGTH: 22 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULAR TYPE: other nucleic acid (synthetic DNA) 
SEQUENCE DESCRIPTION: 
ACAAAGCCAG CCGGAATATC TG 

SEQ ID NO: 10
SEQUENCE LENGTH: 22 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULAR TYPE: other nucleic acid (synthetic DNA) 
SEQUENCE DESCRIPTION: 
TACAATACGA TGCCCCGTTA AG 

SEQ ID NO: 11 
SEQUENCE LENGTH: 32 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear

MOLECULAR TYPE: other nucleic acid (synthetic DNA)
SEQUENCE DESCRIPTION:

CAGAGGAGGT TGATCCCATG GATGAATTTG TA 

SEQ ID NO: 12 
SEQUENCE LENGTH: 32
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear

MOLECULAR TYPE: other nucleic acid (synthetic DNA) 

SEQUENCE DESCRIPTION: 

TTTAGTGGGT GGTGCCCATG GAGCTTCCAA AG 
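
The declared SEQUENCE LENGTH values for the short synthetic sequences (SEQ ID NO: 6 to 12) can be checked mechanically. The Python sketch below is illustrative and is not part of the original listing. The dictionary OLIGOS and the helpers clean, length_ok, gc_fraction and has_ccatgg are hypothetical names. The sketch also reports whether an oligonucleotide contains the hexamer CCATGG, which is the recognition sequence of the restriction enzyme NcoI. This part of the listing does not state why SEQ ID NO: 11 and 12 contain that hexamer, so it is offered only as an observation.

```python
# Sequences and declared lengths copied from the listing above.
OLIGOS = {
    6: ("CCGGAACCGC CTCCCTCAGA GCCGCCACCC TCAGAACCGC CACCC", 45),
    7: ("GUUUUCCCAG UCACGAC", 17),  # listed as synthetic RNA
    8: ("GATGAGTTCG TGTCCGTACA ACT", 23),
    9: ("ACAAAGCCAG CCGGAATATC TG", 22),
    10: ("TACAATACGA TGCCCCGTTA AG", 22),
    11: ("CAGAGGAGGT TGATCCCATG GATGAATTTG TA", 32),
    12: ("TTTAGTGGGT GGTGCCCATG GAGCTTCCAA AG", 32),
}


def clean(seq: str) -> str:
    """Remove the spacing used in the listing and upper-case the result."""
    return "".join(seq.split()).upper()


def length_ok(seq: str, declared: int) -> bool:
    """True when the number of bases equals the declared SEQUENCE LENGTH."""
    return len(clean(seq)) == declared


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases, a common rough descriptor of an oligo."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def has_ccatgg(seq: str) -> bool:
    """True when the sequence contains the hexamer CCATGG."""
    return "CCATGG" in clean(seq)


if __name__ == "__main__":
    for seq_id, (seq, declared) in OLIGOS.items():
        print(
            f"SEQ ID NO: {seq_id}: length ok={length_ok(seq, declared)}, "
            f"GC={gc_fraction(seq):.2f}, CCATGG={has_ccatgg(seq)}"
        )
```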



Claims

1. A DNA polymerase characterized in that said DNA polymerase possesses the following properties:

1) exhibiting higher polymerase activity when assayed by using as a substrate a complex resulting from primer
annealing to a single stranded template DNA, as compared to the case where an activated DNA is used as a
substrate;

2) possessing a 3'->5' exonuclease activity; 

3) being capable of amplifying a DNA fragment of about 20 kbp, in the case where polymerase chain reaction
(PCR) is carried out using λ-DNA as a template under the following conditions:

PCR conditions:

(a) a composition of reaction mixture: containing 10 mM Tris-HCl (pH 9.2), 3.5 mM MgCl2, 75 mM KCl, 400
µM each of dATP, dCTP, dGTP and dTTP, 0.01% bovine serum albumin, 0.1% Triton X-100, 5.0 ng/50 µl
λ-DNA, 10 pmole/50 µl primer λ1 (SEQ ID NO:8 in Sequence Listing), primer λ11 (SEQ ID NO:9 in
Sequence Listing), and 3.7 units/50 µl DNA polymerase;

(b) reaction conditions: carrying out a 30-cycle PCR, wherein one cycle is defined as at 98°C for 10 seconds
and at 68°C for 10 minutes.

2. The DNA polymerase according to claim 1, characterized in that said DNA polymerase exhibits a lower error rate
in DNA synthesis as compared to Taq DNA polymerase.

3. The DNA polymerase according to claim 1 or 2, wherein the molecular weight as determined by gel filtration
method is about 220 kDa or about 385 kDa.

4. The DNA polymerase according to any one of claims 1 to 3, characterized in that said DNA polymerase exhibits an
activity under coexistence of two kinds of DNA polymerase-constituting protein, a first DNA polymerase-constituting
protein and a second DNA polymerase-constituting protein.

5. The DNA polymerase according to claim 4, characterized in that the molecular weights of said first DNA
polymerase-constituting protein and said second DNA polymerase-constituting protein are about 90,000 Da and about
140,000 Da as determined by SDS-PAGE, respectively.

6. The DNA polymerase according to claim 4 or 5, characterized in that said first DNA polymerase-constituting protein
which constitutes the DNA polymerase according to claim 4 or 5 comprises the amino acid sequence as shown by
SEQ ID NO:1 in Sequence Listing, or is a functional equivalent thereof possessing substantially the same activity
which results from deletion, insertion, addition or substitution of one or more amino acids in said amino acid
sequence.

7. The DNA polymerase according to claim 4 or 5, characterized in that said second DNA polymerase-constituting
protein which constitutes the DNA polymerase according to claim 4 or 5 comprises the amino acid sequence as
shown by SEQ ID NO:3 in Sequence Listing, or is a functional equivalent thereof possessing substantially the same
activity which results from deletion, insertion, addition or substitution of one or more amino acids in said amino acid
sequence.

8. The DNA polymerase according to claim 4 or 5, characterized in that said first DNA polymerase-constituting protein
which constitutes the DNA polymerase according to claim 4 or 5 comprises the amino acid sequence as shown by
SEQ ID NO:1 in Sequence Listing, or is a functional equivalent thereof possessing substantially the same activity
which results from deletion, insertion, addition or substitution of one or more amino acids in said amino acid
sequence, and that said second DNA polymerase-constituting protein which constitutes the DNA polymerase
according to claim 4 or 5 comprises the amino acid sequence as shown by SEQ ID NO:3 in Sequence Listing, or
is a functional equivalent thereof possessing substantially the same activity which results from deletion, insertion,
addition or substitution of one or more amino acids in said amino acid sequence.

9. A first DNA polymerase-constituting protein which constitutes the DNA polymerase according to claim 4 or 5,
wherein said first DNA polymerase-constituting protein comprises the amino acid sequence as shown by SEQ ID
NO:1, or an amino acid sequence resulting from deletion, insertion, addition or substitution of one or more amino
acids in said amino acid sequence as a functional equivalent thereof possessing substantially the same activity.

10. A second DNA polymerase-constituting protein which constitutes the DNA polymerase according to claim 4 or 5,
wherein said second DNA polymerase-constituting protein comprises the amino acid sequence as shown by SEQ
ID NO:3, or an amino acid sequence resulting from deletion, insertion, addition or substitution of one or more amino
acids in said amino acid sequence as a functional equivalent thereof possessing substantially the same activity.

11. A DNA containing a base sequence encoding the first DNA polymerase-constituting protein according to claim 9,
characterized in that said DNA comprises an entire sequence of a base sequence encoding the amino acid
sequence as shown by SEQ ID NO:1 in Sequence Listing, or a partial sequence thereof, or that said DNA encodes
a protein having an amino acid sequence resulting from deletion, insertion, addition or substitution of one or more
amino acids in the amino acid sequence of SEQ ID NO:1 and possessing a function as the first DNA
polymerase-constituting protein.

12. A DNA containing a base sequence encoding the first DNA polymerase-constituting protein according to claim 9,
characterized in that said DNA comprises an entire sequence of the base sequence as shown by SEQ ID NO:2 in
Sequence Listing or a partial sequence thereof, or that said DNA comprises a base sequence capable of hybridizing
thereto under stringent conditions.

13. A DNA containing a base sequence encoding the second DNA polymerase-constituting protein according to claim
10, characterized in that said DNA comprises an entire sequence of a base sequence encoding the amino acid
sequence as shown by SEQ ID NO:3, or a partial sequence thereof, or that said DNA encodes a protein having an
amino acid sequence resulting from deletion, insertion, addition or substitution of one or more amino acids in the
amino acid sequence of SEQ ID NO:3 and possessing a function as the second DNA polymerase-constituting
protein.

14. A DNA containing a base sequence encoding the second DNA polymerase-constituting protein according to claim
10, characterized in that said DNA comprises an entire sequence of the base sequence as shown by SEQ ID NO:4
in Sequence Listing or a partial sequence thereof, or that said DNA comprises a base sequence capable of
hybridizing thereto under stringent conditions.

15. A method for producing a DNA polymerase, characterized in that the method comprises culturing a transformant
containing both a gene encoding the first DNA polymerase-constituting protein according to claim 11 or 12, and a gene
encoding the second DNA polymerase-constituting protein according to claim 13 or 14, and collecting the DNA
polymerase from the resulting culture.

16. A method for producing a DNA polymerase, characterized in that the method comprises culturing a transformant
containing a gene encoding the first DNA polymerase-constituting protein according to claim 11 or 12, and a
transformant containing a gene encoding the second DNA polymerase-constituting protein according to claim 13 or 14,
separately; mixing DNA polymerase-constituting proteins contained in the resulting culture; and collecting the DNA
polymerase.
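
The PCR conditions recited in claim 1 imply a few derived quantities that are easy to get wrong by hand. The Python sketch below is illustrative and is not part of the claims. It assumes a 50 µl reaction (the volume to which the claim normalizes its amounts), assumes 10 pmol of each primer, and ignores ramp times between temperature steps. All function and variable names are hypothetical.

```python
REACTION_VOLUME_UL = 50.0  # assumed reaction volume, in microlitres


def amount_nmol(conc_um: float, volume_ul: float) -> float:
    """Amount in nmol for a concentration in µM and a volume in µl.

    µM x µl gives pmol, so divide by 1000 to express the result in nmol.
    """
    return conc_um * volume_ul / 1000.0


def concentration_um(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in µM for an amount in pmol and a volume in µl."""
    return amount_pmol / volume_ul


def total_hold_seconds(cycles: int, denature_s: int, extend_s: int) -> int:
    """Programmed hold time over all cycles, ignoring temperature ramps."""
    return cycles * (denature_s + extend_s)


# Values taken from claim 1: 400 µM each dNTP, 3.5 mM MgCl2,
# 10 pmol primer per 50 µl, 30 cycles of 98°C for 10 s and 68°C for 10 min.
dntp_each_nmol = amount_nmol(400.0, REACTION_VOLUME_UL)
mgcl2_nmol = amount_nmol(3500.0, REACTION_VOLUME_UL)
primer_um = concentration_um(10.0, REACTION_VOLUME_UL)
hold_s = total_hold_seconds(30, 10, 10 * 60)

if __name__ == "__main__":
    print(f"each dNTP: {dntp_each_nmol} nmol per reaction")
    print(f"MgCl2: {mgcl2_nmol} nmol per reaction")
    print(f"primer: {primer_um} µM")
    print(f"programmed hold time: {hold_s} s ({hold_s / 3600:.2f} h)")
```

Under these assumptions the 30 cycles amount to 18300 s of programmed hold time, about 5 hours 5 minutes, most of it spent in the 10-minute step at 68°C.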